(57) Abstract

The invention concerns hepatocyte growth factor (HGF) amino acid sequence variants. The preferred variants are resistant
to proteolytic cleavage by enzymes capable of in vivo conversion of HGF into its two-chain form and/or contain a mutation
within the protease domain of HGF.
HEPATOCYTE GROWTH FACTOR VARIANTS



BACKGROUND OF THE INVENTION 
I. Field of the Invention 

The present invention concerns amino acid sequence variants of
hepatocyte growth factor (HGF), methods and means for preparing such
variants, and pharmaceutical compositions comprising them.

II. Description of Background and Related Art

HGF was identified initially as a mitogen for hepatocytes
[Michalopoulos et al., Cancer Res. 44, 4414-4419 (1984); Russel et al.,
J. Cell. Physiol. 119, 183-192 (1984) and Nakamura et al., Biochem.
Biophys. Res. Comm. 122, 1450-1459 (1984)]. Nakamura et al., supra,
reported the purification of HGF from the serum of partially
hepatectomized rats. Subsequently, HGF was purified from rat
platelets, and its subunit structure was determined [Nakamura et al.,
Proc. Natl. Acad. Sci. USA 83, 6489-6493 (1986); and Nakamura et al.,
FEBS Letters 224, 311-316 (1987)]. The purification of human HGF
(huHGF) from human plasma was first described by Gohda et al., J.
Clin. Invest. 81, 414-419 (1988).

Both rat HGF and huHGF have been molecularly cloned, including
the cloning and sequencing of a naturally occurring variant lacking 5
amino acids designated "delta5 HGF" [Miyazawa et al., Biochem.
Biophys. Res. Comm. 163, 967-973 (1989); Nakamura et al., Nature 342,
440-443 (1989); Seki et al., Biochem. and Biophys. Res. Commun. 172,
321-327 (1990); Tashiro et al., Proc. Natl. Acad. Sci. USA 87,
3200-3204 (1990); Okajima et al., Eur. J. Biochem. 193, 375-381 (1990)].

WO 93/23541 PCT/US93/04648

The mature form of huHGF, corresponding to the major form
purified from human serum, is a disulfide-linked heterodimer derived
by proteolytic cleavage of the human pro-hormone between amino acids
R494 and V495. This cleavage process generates a molecule composed of
an α-subunit of 440 amino acids (Mr 69 kDa) and a β-subunit of 234
amino acids (Mr 34 kDa). The nucleotide sequence of the hHGF cDNA
reveals that both the α- and the β-chains are contained in a single
open reading frame coding for a pre-pro precursor protein. In the
predicted primary structure of mature hHGF, an interchain S-S bridge
is formed between Cys 487 of the α-chain and Cys 604 in the β-chain
(see Nakamura et al., Nature, supra). The N-terminus of the α-chain
is preceded by 54 amino acids, starting with a methionine group. This
segment includes a characteristic hydrophobic leader (signal) sequence
of 31 residues and the prosequence. The α-chain starts at amino acid
(aa) 55, and contains four Kringle domains. The so-called "hairpin
domain" includes amino acid residues 70-96 of wild-type human HGF.
The Kringle 1 domain extends from about aa 128 to about aa 206, the
Kringle 2 domain is between about aa 211 and about aa 288, the Kringle
3 domain is defined as extending from about aa 303 to about aa 383,
and the Kringle 4 domain extends from about aa 391 to about aa 464 of
the α-chain. It will be understood that the definition of the various
Kringle domains is based on their homology with kringle-like domains
of other proteins (prothrombin, plasminogen); therefore, the above
limits are only approximate. As yet, the function of these Kringles
has not been determined. The β-chain of huHGF shows high homology to
the catalytic domain of serine proteases (38% homology to the
plasminogen serine protease domain). However, two of the three
residues which form the catalytic triad of serine proteases are not
conserved in huHGF. Therefore, despite its serine protease-like
domain, hHGF appears to have no proteolytic activity and the precise
role of the β-chain remains unknown. HGF contains four putative
glycosylation sites, which are located at positions 294 and 402 of the
α-chain and at positions 566 and 653 of the β-chain.
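
The residue numbering given in the preceding paragraph can be collected into a small lookup, which makes it easy to check the stated subunit lengths and to see which region a given position falls in. The following is a minimal sketch only: the names are illustrative, the Kringle limits are the approximate ones quoted above, and the last residue of the β-chain (728) is not stated in the text but is derived here from the cleavage site and the stated β-subunit length.

```python
# Approximate region boundaries of the huHGF precursor, as quoted in the text.
# All identifiers are illustrative; the Kringle limits are approximate.
ALPHA_START = 55          # first residue of the alpha-chain
CLEAVAGE_P1 = 494         # R494, last residue of the alpha-chain
BETA_LENGTH = 234         # stated length of the beta-subunit

# Assumption: the beta-chain end is derived, not quoted from the text.
BETA_END = CLEAVAGE_P1 + BETA_LENGTH

REGIONS = [
    ("signal sequence", 1, 31),
    ("hairpin", 70, 96),
    ("Kringle 1", 128, 206),
    ("Kringle 2", 211, 288),
    ("Kringle 3", 303, 383),
    ("Kringle 4", 391, 464),
    ("beta-chain", CLEAVAGE_P1 + 1, BETA_END),
]


def alpha_length():
    """Length of the alpha-subunit implied by the stated start and cleavage site."""
    return CLEAVAGE_P1 - ALPHA_START + 1


def region_of(position):
    """Return the name of the listed region containing `position`, or None."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    print(alpha_length())       # agrees with the 440 residues stated above
    print(region_of(159))       # a position inside the Kringle 1 limits
    print(region_of(673))       # a position in the beta-chain
```

Positions that lie between the listed regions, such as the P1 residue 494 itself, return None in this sketch because only the regions named in the text are tabulated.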

In a portion of cDNA isolated from human leukocytes, an in-frame
deletion of 15 base pairs was observed. Transient expression of the
cDNA sequence in COS-1 cells revealed that the encoded HGF molecule
(delta5 HGF) lacking 5 amino acids in the Kringle 1 domain was fully
functional (Seki et al., supra).

A naturally occurring huHGF variant has recently been identified
which corresponds to an alternative spliced form of the huHGF
transcript containing the coding sequences for the N-terminal finger
and first two kringle domains of mature huHGF [Chan et al., Science
254, 1382-1385 (1991); Miyazawa et al., Eur. J. Biochem. 197, 15-22
(1991)]. This variant, designated HGF/NK2, has been proposed to be a
competitive antagonist of mature huHGF.

The comparison of the amino acid sequence of rat HGF with that of
huHGF revealed that the two sequences are highly conserved and have
the same characteristic structural features. The length of the four
Kringle domains in rat HGF is exactly the same as in huHGF.
Furthermore, the cysteine residues are located in exactly the same
positions, an indication of similar three-dimensional structures
(Okajima et al., supra; Tashiro et al., supra).

The HGF receptor has been identified as the product of the c-Met
proto-oncogene [Bottaro et al., Science 251, 802-804 (1991); Naldini
et al., Oncogene 6, 501-504 (1991)], a 190-kDa heterodimeric (a
disulfide-linked 50-kDa α-chain and a 145-kDa β-chain)
membrane-spanning tyrosine kinase protein [Park et al., Proc. Natl.
Acad. Sci. USA 84, 6379-6383 (1987)]. The c-Met protein becomes
phosphorylated on tyrosine residues of the 145-kDa β-subunit upon HGF
binding.

The levels of HGF increase in the plasma of patients with hepatic
failure (Gohda et al., supra) and in the plasma [Lindroos et al.,
Hepatol. 13, 734-750 (1991)] or serum [Asami et al., J. Biochem. 109,
8-13 (1991)] of animals with experimentally induced liver damage. The
kinetics of this response is rapid, and precedes the first round of
DNA synthesis during liver regeneration, suggesting that HGF may play a
key role in initiating this process. More recently, HGF has been
shown to be a mitogen for a variety of cell types including
melanocytes, renal tubular cells, keratinocytes, certain endothelial
cells and cells of epithelial origin [Matsumoto et al., Biochem.
Biophys. Res. Commun. 45-51 (1991); Igawa et al., Biochem.
Biophys. Res. Commun. 174, 831-838 (1991); Han et al., Biochem. 30,
9768-9780 (1991); Rubin et al., Proc. Natl. Acad. Sci. USA 88, 415-419
(1991)]. Interestingly, HGF can also act as a "scatter factor", an
activity that promotes the dissociation of epithelial and vascular
endothelial cells in vitro [Stoker et al., Nature 327, 239-242 (1987);
Weidner et al., J. Cell Biol. 111, 2097-2108 (1990); Naldini et al.,
EMBO J. 10, 2867-2878 (1991)]. Moreover, HGF has recently been
described as an epithelial morphogen [Montesano et al., Cell 67,
901-908 (1991)]. Therefore, HGF has been postulated to be important in
tumor invasion and in embryonic development. Chronic c-Met/HGF
receptor activation has been observed in certain malignancies [Cooper
et al., EMBO J. 5, 2623 (1986); Giordano et al., Nature 339, 155
(1989)].

It would be desirable to better understand the structure-activity
relationship of HGF in order to identify functionally important
domains in the HGF amino acid sequence.

It would be particularly desirable to identify the amino acid
residues which are responsible for the interaction of HGF with its
receptor.

It would also be desirable to identify the amino acid residues
which are responsible for HGF biological activity.

It would further be desirable to provide amino acid sequence
variants of HGF that have altered (preferably enhanced) receptor
binding affinity as compared to the corresponding mature, wild-type
HGF.

It would also be desirable to provide HGF amino acid sequence
variants which have retained or enhanced receptor binding affinity as
compared to the corresponding wild-type HGF, but are substantially
devoid of HGF biological activity. Such molecules could act as
competitive antagonists of HGF action.

It would further be desirable to provide HGF amino acid sequence
variants that have retained or enhanced receptor binding affinity and
increased biological activity as compared to the corresponding
wild-type HGF (HGF agonists).

Accordingly, it is an object of the present invention to provide HGF
variants having retained or improved receptor binding affinity as
compared to the corresponding mature wild-type HGF. It is another
object of the invention to provide HGF variants that have retained
substantially full receptor binding affinity of the corresponding
mature wild-type HGF and are substantially incapable of HGF receptor
activation. It is a further object to provide HGF variants that have
retained substantially full receptor binding affinity of the
corresponding mature wild-type HGF and have improved biological
properties.

These and further objects will be apparent to one of ordinary 
skill in the art. 

SUMMARY OF THE INVENTION
The foregoing objects are achieved by the provision of HGF
variants having amino acid alterations within various domains of the
wild-type HGF amino acid sequence.

In one aspect, HGF variants are provided that are resistant to
proteolytic cleavage by enzymes that are capable of in vivo conversion
of HGF into its two-chain form. The variants are preferably
stabilized in single-chain form by site-directed mutagenesis within a
region recognized by an enzyme capable of converting HGF into its
two-chain form.

In a particular embodiment, such variants have an amino acid
alteration at or adjacent to amino acid positions 493, 494, 495 or 496
of the wild-type huHGF amino acid sequence. The alteration preferably
is the substitution of at least one amino acid at amino acid positions
493-496 of the wild-type huHGF amino acid sequence.

In another embodiment, the variants retain substantially full
receptor binding affinity of the corresponding wild-type HGF and are
substantially incapable of HGF receptor activation. HGF variants with
enhanced receptor binding affinity and substantially lacking the
ability to activate the HGF receptor are particularly preferred. Such
compounds are competitive antagonists of the corresponding wild-type
HGF and, when present in sufficient concentration, are capable of
inhibiting the binding of their wild-type counterparts to their
ligands.

In another aspect, there are provided HGF variants having an amino
acid alteration at a site within the protease domain of HGF and
retaining substantially full receptor binding affinity of the
corresponding wild-type HGF.

In a specific embodiment, these variants have substantially
retained or improved receptor binding affinity as compared to the
corresponding wild-type HGF, and are substantially devoid of HGF
biological activity. Such compounds, if present in sufficient
concentration, will act as competitive antagonists of HGF action.

In another specific embodiment, the variants combine
substantially retained or improved receptor binding affinity with
improved biological activity, as compared to the corresponding
wild-type HGF. Such variants are valuable as HGF agonists.

In a preferred embodiment, the HGF variants within this group
comprise an alteration in a region corresponding to the catalytic site
of serine proteases. More preferably, the alteration is at or
adjacent to any of positions 534, 673 and 692 of the wild-type human
HGF (huHGF) amino acid sequence.

The alteration preferably is substitution.

In a particularly preferred embodiment, at least two of the
residues at amino acid positions 534, 673 and 692 of the wild-type
huHGF sequence are replaced by another amino acid.




In a preferred group of the HGF variants herein, both tyrosine
(Y) at position 673 and valine (V) at position 692 of the huHGF
sequence are replaced by another amino acid. This alteration
potentially yields HGF variants which substantially retain the
receptor binding affinity of wild-type huHGF but are substantially
devoid of HGF biological activity.

The mutations around the one-chain to two-chain cleavage site and
within the protease domain may be advantageously combined for improved
biological properties.

Variants with increased receptor binding affinity as compared to
the corresponding wild-type HGF are particularly preferred. The
increase in receptor binding affinity may, for example, be
accomplished by an alteration in the receptor-binding domain of the
wild-type HGF amino acid sequence, and preferably within the Kringle 1
domain.

Kringle 1 variants with amino acid alterations within the patch
defined by amino acid positions 159, 161, 195 and 197, or at amino
acid position 173 of the wild-type huHGF amino acid sequence are
particularly preferred, but other positions within the Kringle 1
domain have also been identified as having a genuine effect on the
receptor binding properties and/or the specific activity of HGF.

Furthermore, amino acid sequence variants with alterations at
amino acid positions preceding the Kringle 1 domain, in particular
those just N- or C-terminal to the hairpin domain, have been found to
have significantly different binding properties and biological
activity from those of the corresponding wild-type HGF.

The variants of this invention may be devoid of functional 
Kringle 2 and/or Kringle 3 and/or Kringle 4 domains. 

In all embodiments, huHGF amino acid sequence variants are
preferred.

In other embodiments, the invention relates to DNA sequences
encoding the variants described above, replicable expression vectors
containing and capable of expressing such DNA sequences in a
transformed host cell, transformed host cells, and a process
comprising culturing the host cells so as to express the DNAs encoding
the HGF variants.

In yet another embodiment, the invention relates to therapeutic
compositions comprising HGF variants having HGF agonist or antagonist
properties.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a schematic representation of the α- and β-subunits
of huHGF. Shown in the α-chain are the signal sequence (boxed region),
which encompasses amino acids 1-31, the predicted finger and four
Kringle domains, each with their respective three disulfide bonds. The
cleavage site for generation of the heterodimeric α/β form of huHGF
immediately follows the P1 cleavage residue R494. This last residue
has been specifically substituted with either E, D or A to generate
HGF single-chain variants. The β-chain, which follows the cleavage
site, contains homology to serine proteases. It is proposed that the
α- and β-chains are held together by a unique disulfide bridge between
C487(α) and C604(β) (Nakamura et al., 1989, supra). Three residues
within the β-chain have been substituted individually or in
combination to reconstitute the authentic residues of a serine
protease. Schematic representations of the mature forms of the
C-terminal truncation variants are depicted below: N-207, deleted after
the first Kringle; N-303, deleted after the second Kringle; N-384,
deleted after the third Kringle and the α-chain. Also shown are the
variants where deletions of each of the Kringles (ΔK1, ΔK2, ΔK3 and
ΔK4) were introduced. In each case, the deletions specifically remove
the entire Kringle from C1 to C6.

Figure 2 shows the results of Western blot of wild-type rhuHGF
and single-chain variants. Conditioned media from mock transfected
293 cells or stable 293 cells expressing either wild-type rhuHGF (WT)
or the variants R494E, R494A or R494D were fractionated under reducing
conditions on an 8% sodium dodecyl sulfate-polyacrylamide gel and
blotted. The blot was reacted with polyclonal anti-HGF antisera which
recognizes epitopes primarily in the α-chain. Molecular masses
(kilodaltons) of the marker are as indicated. Also indicated are the
positions of the α-chain and uncleaved single-chain forms of huHGF.
Note that the polyclonal antibody cross-reacts with an unidentified
band (*) present even in the control transfected 293 cells, which do
not express detectable quantities of huHGF.

Figure 3: Mitogenic activity (A) and competitive receptor binding
(B) of wild-type (WT) rhuHGF and single-chain variants. (A) Biological
activity was determined by the ability of WT rhuHGF and variants to
induce DNA synthesis of rat hepatocytes in primary culture as
described in Example 2. Shown are the mean cpm from duplicates in a
representative assay. Mock supernatant from control cells did not
stimulate DNA synthesis in these cells (no cpm increase above
background levels). (B) To perform competitive binding, various
dilutions of supernatants of human 293 cells containing wt rhuHGF or
variants were incubated with 50 pM of the huHGF receptor-IgG fusion
protein as described in Example 2. Data represent inhibition of
binding as the percentage of any competing ligand from a
representative experiment and were corrected by subtraction of
background values from control 293 cells.

Figure 4: Western blot of ligand-induced tyrosine phosphorylation
on the 145 kDa β-subunit of the HGF receptor by wild-type rhuHGF,
single-chain or protease domain huHGF variants. Lysates from A549
cells incubated for 5 minutes without (-) or with 200 ng/ml of
purified wt rhuHGF (WT), single-chain (R494E) or double protease
variants (Y673S,V692S) were prepared and immunoprecipitated with an
anti-HGF receptor antibody and blotted with anti-phosphotyrosine
antibodies. Molecular masses (kilodaltons) are as indicated.

Figure 5 depicts the nucleotide sequence encoding the plasmid
pRK5.1 (SEQ. ID. NO: 1).

Figure 6 depicts the nucleotide sequence encoding the plasmid
p.CIS.EBON (SEQ. ID. NO: 15).

DETAILED DESCRIPTION OF THE INVENTION

I. Definitions

As used herein, the terms "hepatocyte growth factor", "HGF" and
"huHGF" refer to a (human) growth factor capable of specific binding
to a receptor of wild-type (human) HGF, which growth factor typically
has a structure with six domains (finger, Kringle 1, Kringle 2,
Kringle 3, Kringle 4 and serine protease domains), but nonetheless may
have fewer domains or may have some of its domains repeated if it
still retains its qualitative HGF receptor binding ability. This
definition specifically includes the delta5 huHGF as disclosed by Seki
et al., supra. The terms "hepatocyte growth factor" and "HGF" also
include hepatocyte growth factor from any non-human animal species,
and in particular rat HGF.

The terms "wild-type human hepatocyte growth factor", "native
human hepatocyte growth factor", "wild-type huHGF", and "native huHGF"
refer to native sequence human HGF, i.e., that encoded by the cDNA
sequence published by Miyazawa et al., 1989, supra, or Nakamura et
al., 1989, supra, including its mature, pre, pre-pro, and pro forms,
purified from natural source, chemically synthesized or recombinantly
produced. The sequences reported by Miyazawa et al. and Nakamura et
al. differ in 14 amino acids. The reason for the differences is not
entirely clear; polymorphism or cloning artifacts are among the
possibilities. Both sequences are specifically encompassed by the
foregoing terms as defined for the purpose of the present invention.
It will be understood that natural allelic variations exist and can
occur among individuals, as demonstrated by one or more amino acid
differences in the amino acid sequence of each individual. Amino acid
positions in the variant huHGF molecules herein are indicated in
accordance with the numbering of Miyazawa et al., 1989, supra.

The terms "(HGF) biological activity", "biologically active",
"activity" and "active" refer to any mitogenic, motogenic or
morphogenic activities exhibited by wild-type human HGF. The HGF
biological activity may, for example, be determined in an in vitro or
in vivo assay of hepatocyte growth promotion. Adult rat hepatocytes
in primary culture have been extensively used to search for factors
that regulate hepatocyte proliferation. Accordingly, the mitogenic
effect of an HGF variant can be conveniently determined in an assay
suitable for testing the ability of an HGF molecule to induce DNA
synthesis of rat hepatocytes in primary cultures, such as, for
example, described in Example 2. Human hepatocytes are also available
from whole liver perfusion on organs deemed unacceptable for
transplantation, pare-downs of adult livers used for transplantation
in children, fetal livers and liver remnants removed at surgery for
other indications. Human hepatocytes can be cultured similarly to the
methods established for preparing primary cultures of normal rat
hepatocytes. Hepatocyte DNA synthesis can, for example, be assayed by
measuring incorporation of [3H]thymidine into DNA, with appropriate
hydroxyurea controls for replicative synthesis.

The effect of HGF variants on hepatocyte growth can also be
tested in vivo in animal models of liver dysfunction and regeneration,
such as in rats following partial hepatectomy or carbon
tetrachloride-caused hepatic injury, in D-galactosamine-induced acute
liver failure models, etc. According to a suitable protocol, a liver
poison, e.g. α-naphthylisothiocyanate (ANIT), is administered to rats
in a predetermined concentration capable of causing reproducible
significant elevation of liver enzyme and bilirubin levels. The rats
are then treated with the HGF variant to be tested and sacrificed, and
the liver enzyme and bilirubin levels are determined. The livers are
additionally observed for hepatic lesions.

The expression "retaining substantially full receptor binding
affinity of wild-type (hu)HGF" and grammatical variants thereof as used
herein mean that the receptor binding affinity of the HGF variant is
not less than about 70%, preferably not less than about 80%, more
preferably not less than about 90%, most preferably not less than
about 95% of the affinity with which wild-type (hu)HGF binds its
native receptor.

The terms "substantially incapable of HGF receptor activation"
and "substantially devoid of HGF biological activity" mean that the
activity exhibited by an HGF variant is less than about 20%,
preferably less than about 15%, more preferably less than about 10%,
most preferably less than about 5% of the respective activity of
wild-type (human) HGF in an established assay of receptor activation
or HGF biological activity, as hereinabove defined.
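
The two definitions above amount to simple ratio tests against wild-type values, and the following minimal sketch shows how they might be applied to assay readouts. The function names are hypothetical, the broadest thresholds (about 70% and about 20%) are used as defaults, and it is assumed that affinity and activity are expressed on scales where a larger number means stronger binding or greater activity. Because the definitions say "about", the cut-offs are not meant as sharp limits.

```python
def retains_full_binding(variant_affinity, wild_type_affinity, threshold=0.70):
    """True if the variant affinity is not less than `threshold` x wild type.

    Assumption: affinities are on a scale where larger means tighter binding.
    """
    return variant_affinity / wild_type_affinity >= threshold


def substantially_devoid_of_activity(variant_activity, wild_type_activity,
                                     threshold=0.20):
    """True if the variant activity is less than `threshold` x wild type."""
    return variant_activity / wild_type_activity < threshold


def is_candidate_antagonist(variant_affinity, wild_type_affinity,
                            variant_activity, wild_type_activity):
    """Variant binds the receptor like wild type but lacks biological activity."""
    return (retains_full_binding(variant_affinity, wild_type_affinity)
            and substantially_devoid_of_activity(variant_activity,
                                                 wild_type_activity))


if __name__ == "__main__":
    # Hypothetical readouts, in arbitrary units relative to wild type = 100.
    print(is_candidate_antagonist(80, 100, 5, 100))    # binds well, inactive
    print(is_candidate_antagonist(80, 100, 50, 100))   # binds well, still active
```

The preferred thresholds (80%, 90%, 95% for binding; 15%, 10%, 5% for activity) can be applied by passing a different `threshold` value.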

The terms "amino acid" and "amino acids" refer to all naturally
occurring L-α-amino acids. This definition is meant to include
norleucine, ornithine, and homocysteine. The amino acids are
identified by either the single-letter or three-letter designations:

Asp   D   aspartic acid       Ile   I   isoleucine
Thr   T   threonine           Leu   L   leucine
Ser   S   serine              Tyr   Y   tyrosine
Glu   E   glutamic acid       Phe   F   phenylalanine
Pro   P   proline             His   H   histidine
Gly   G   glycine             Lys   K   lysine
Ala   A   alanine             Arg   R   arginine
Cys   C   cysteine            Trp   W   tryptophan
Val   V   valine              Gln   Q   glutamine
Met   M   methionine          Asn   N   asparagine

These amino acids may be classified according to the chemical
composition and properties of their side chains. They are broadly
classified into two groups, charged and uncharged. Each of these
groups is divided into subgroups to classify the amino acids more
accurately:

I. Charged Amino Acids

Acidic Residues: aspartic acid, glutamic acid

Basic Residues: lysine, arginine, histidine

II. Uncharged Amino Acids

Hydrophilic Residues: serine, threonine, asparagine, glutamine

Aliphatic Residues: glycine, alanine, valine, leucine, isoleucine

Non-polar Residues: cysteine, methionine, proline

Aromatic Residues: phenylalanine, tyrosine, tryptophan

The terms "alteration", "amino acid alteration", "variant" and
"amino acid sequence variant" refer to HGF molecules with some
differences in their amino acid sequences as compared to wild-type
(human) HGF. Ordinarily, the variants will possess at least about 80%
homology with those domains of wild-type (human) HGF that are retained
in their structure, and preferably, they will be at least about 90%
homologous with such domains.

Substitutional HGF variants are those that have at least one
amino acid residue in the corresponding wild-type HGF sequence removed
and a different amino acid inserted in its place at the same position.
The substitutions may be single, where only one amino acid in the
molecule has been substituted, or they may be multiple, where two or
more amino acids have been substituted in the same molecule.

Substantial changes in the activity of the HGF molecule may be
obtained by substituting an amino acid with a side chain that is
significantly different in charge and/or structure from that of the
native amino acid. This type of substitution would be expected to
affect the structure of the polypeptide backbone and/or the charge or
hydrophobicity of the molecule in the area of the substitution.

Moderate changes in the activity of the HGF molecule would be
expected by substituting an amino acid with a side chain that is
similar in charge and/or structure to that of the native molecule.
This type of substitution, referred to as a conservative substitution,
would not be expected to substantially alter either the structure of
the polypeptide backbone or the charge or hydrophobicity of the
molecule in the area of the substitution.
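
One simple way to make the distinction between conservative and non-conservative substitutions operational is to reuse the side-chain classification given earlier in this section. The sketch below treats a substitution as conservative when both residues fall in the same subgroup. That rule is an assumption made for illustration only; the text defines conservative substitutions by similarity in charge and/or structure and does not prescribe this particular test.

```python
# Side-chain subgroups as listed in the classification above (one-letter codes).
SUBGROUPS = {
    "acidic": "DE",
    "basic": "KRH",
    "hydrophilic": "STNQ",
    "aliphatic": "GAVLI",
    "non-polar": "CMP",
    "aromatic": "FYW",
}

# Reverse lookup: one-letter code -> subgroup name.
SUBGROUP_OF = {aa: name for name, members in SUBGROUPS.items() for aa in members}


def is_conservative(wild_type, replacement):
    """Illustrative rule: conservative if both residues share a subgroup."""
    if wild_type == replacement:
        return False  # not a substitution at all
    return SUBGROUP_OF[wild_type] == SUBGROUP_OF[replacement]


if __name__ == "__main__":
    print(is_conservative("R", "K"))  # basic -> basic
    print(is_conservative("R", "E"))  # basic -> acidic, as in R494E
    print(is_conservative("Y", "S"))  # aromatic -> hydrophilic, as in Y673S
```

Under this rule the substitutions named elsewhere in this description (R494E, Y673S, V692S) all cross subgroup boundaries, which is consistent with their being intended to change the behaviour of the molecule.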

Insertional HGF variants are those with one or more amino acids
inserted immediately adjacent to an amino acid at a particular
position in the wild-type HGF molecule. Immediately adjacent to an
amino acid means connected to either the α-carboxy or α-amino
functional group of the amino acid. The insertion may be one or more
amino acids. Ordinarily, the insertion will consist of one or two
conservative amino acids. Amino acids similar in charge and/or
structure to the amino acids adjacent to the site of insertion are
defined as conservative. Alternatively, this invention includes
insertion of an amino acid with a charge and/or structure that is
substantially different from the amino acids adjacent to the site of
insertion.

Deletional variants are those with one or more amino acids in the
wild-type HGF molecule removed. Ordinarily, deletional variants will
have one or two amino acids deleted in a particular region of the HGF
molecule.

The notations used throughout this application to describe huHGF
amino acid sequence variants are described below. The location of a
particular amino acid in the polypeptide chain of huHGF is identified
by a number. The number refers to the amino acid position in the
amino acid sequence of the mature, wild-type human HGF polypeptide as
disclosed in Miyazawa et al., 1989, supra. In the present application,
similarly positioned residues in huHGF variants are designated by
these numbers even though the actual residue number is not so numbered
due to deletions or insertions in the molecule. This will occur, for
example, with site-directed deletional or insertional variants. The
amino acids are identified using the one-letter code. Substituted
amino acids are designated by identifying the wild-type amino acid on
the left side of the number denoting the position in the polypeptide
chain of that amino acid, and identifying the substituted amino acid
on the right side of the number.

For example, replacement of the amino acid arginine (R) by
glutamic acid (E) at amino acid position 494 of the wild-type huHGF
molecule yields a huHGF variant designated R494E huHGF. Similarly,
the huHGF variant obtained by substitution of serine (S) for tyrosine
(Y) at amino acid position 673 and serine (S) for valine (V) at amino
acid position 692 of the wild-type huHGF molecule is designated
Y673S,V692S huHGF.
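
The substitution notation just described (wild-type residue, position, replacement residue, with multiple substitutions separated by commas) is regular enough to be parsed mechanically. The following is a minimal sketch under stated assumptions: the function and class names are hypothetical, only substitution designations are handled (not the deletional or insertional notations), and the six-residue sequence used in the demonstration is a made-up placeholder rather than the actual huHGF sequence.

```python
import re
from typing import List, NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_TOKEN = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str     # one-letter code of the residue being replaced
    position: int      # position in the wild-type numbering
    replacement: str   # one-letter code of the new residue


def parse_variant(designation: str) -> List[Substitution]:
    """Parse a designation such as 'R494E' or 'Y673S,V692S'."""
    substitutions = []
    for token in designation.split(","):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution designation: {token!r}")
        wild_type, position, replacement = match.groups()
        substitutions.append(Substitution(wild_type, int(position), replacement))
    return substitutions


def apply_substitutions(sequence: str, substitutions, first_position: int = 1) -> str:
    """Apply substitutions to `sequence`, whose first residue is numbered
    `first_position`. Raises if the stated wild-type residue does not match."""
    residues = list(sequence)
    for sub in substitutions:
        index = sub.position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"expected {sub.wild_type} at {sub.position}, found {residues[index]}"
            )
        residues[index] = sub.replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_variant("Y673S,V692S"))
    # Placeholder fragment numbered 491-496; NOT the real huHGF sequence.
    print(apply_substitutions("AAARVV", parse_variant("R494E"), first_position=491))
```

Checking the wild-type residue before replacing it guards against applying a designation to a sequence that uses a different numbering, which matters here because positions follow the Miyazawa et al. numbering.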

Deletional variants are identified by indicating the amino acid
residue and position at either end of the deletion, inclusive, and
placing the Greek letter delta, "Δ", to the left of the indicated
amino acids. Deletion of a single amino acid is indicated by placing
Δ to the left of the single letter code and number indicating the
position of the deleted amino acid.

Insertional variants are designated by the use of brackets "[]"
around the inserted amino acids, and the location of the insertion is
denoted by indicating the position of the amino acid on either side of
the insertion.

The alterations in the amino acid sequence of the HGF variants
herein are indicated with reference to amino acid positions in the
wild-type human HGF amino acid sequence (Miyazawa et al., supra).
Methods for the alignment of homologous amino acid sequences from
various species are well known in the art.

The terms "DNA sequence encoding", "DNA encoding" and "nucleic
acid encoding" refer to the order or sequence of deoxyribonucleotides
along a strand of deoxyribonucleic acid. The order of these
deoxyribonucleotides determines the order of amino acids along the
polypeptide chain. The DNA sequence thus codes for the amino acid
sequence.

The terms "replicable expression vector" and "expression vector"
refer to a piece of DNA, usually double-stranded, which may have
inserted into it a piece of foreign DNA. Foreign DNA is defined as
heterologous DNA, which is DNA not naturally found in the host cell.
The vector is used to transport the foreign or heterologous DNA into a
suitable host cell. Once in the host cell, the vector can replicate
independently of the host chromosomal DNA, and several copies of the
vector and its inserted (foreign) DNA may be generated. In addition,
the vector contains the necessary elements that permit translating the
foreign DNA into a polypeptide. Many molecules of the polypeptide
encoded by the foreign DNA can thus be rapidly synthesized.

In the context of the present invention the expressions "cell",
"cell line", and "cell culture" are used interchangeably, and all such
designations include progeny.

The terms "transformed (host) cell", "transformant" and
"transformed" refer to the introduction of DNA into a cell. The cell
is termed a "host cell". The introduced DNA is usually in the form of
a vector containing an inserted piece of DNA. The introduced DNA
sequence may be from the same species as the host cell or a different
species from the host cell, or it may be a hybrid DNA sequence,
containing some foreign and some homologous DNA. The words
transformants and transformed (host) cells include the primary subject
cell and cultures derived therefrom, without regard to the number of
transfers. It is also understood that all progeny may not be
precisely identical in DNA content, due to deliberate or inadvertent
mutations. Mutant progeny that have the same function or biological
property as screened for in the originally transformed cell are
included.

The technique of "polymerase chain reaction" or "PCR", as used
herein, generally refers to a procedure wherein minute amounts of a
specific piece of nucleic acid, RNA and/or DNA, are amplified as
described in U.S. Patent No. 4,683,195, issued 28 July 1987 and in
Current Protocols in Molecular Biology, Ausubel et al. eds., Greene
Publishing Associates and Wiley-Interscience 1991, Volume 2, Chapter
15.

The term "monoclonal antibody" as used herein refers to an
antibody obtained from a population of substantially homogeneous
antibodies, i.e., the individual antibodies comprising the population
are identical except for possible naturally occurring mutations that
may be present in minor amounts. Thus, the modifier "monoclonal"
indicates the character of the antibody as not being a mixture of
discrete antibodies. The monoclonal antibodies include hybrid and
recombinant antibodies produced by splicing a variable (including
hypervariable) domain of an anti-selectin ligand antibody with a
constant domain (e.g. "humanized" antibodies), only one of which is
directed against a selectin, or a light chain with a heavy chain, or a
chain from one species with a chain from another species, or fusions
with heterologous proteins, regardless of species of origin or
immunoglobulin class or subclass designation, as well as antibody
fragments (e.g., Fab, F(ab')2, Fv). Cabilly, et al., U.S. Pat.
No. 4,816,567; Mage & Lamoyi, in Monoclonal Antibody Production
Techniques and Applications, pp. 79-97 (Marcel Dekker, Inc., New York,
1987). Thus, the modifier "monoclonal" indicates the character of the
antibody as being obtained from such a substantially homogeneous
population of antibodies, and is not to be construed as requiring
production of the antibody by any particular method.

The term "immunoglobulin" generally refers to polypeptides
comprising a light or heavy chain, usually both disulfide bonded in the
native "Y" configuration, although other linkage between them,
including tetramers or aggregates thereof, is within the scope hereof.

Immunoglobulins (Ig) and certain variants thereof are known and
many have been prepared in recombinant cell culture. For example, see
U.S. Patent 4,745,055; EP 256,654; Faulkner et al., Nature 298:286
(1982); EP 120,694; EP 125,023; Morrison, J. Immun. 123:791 (1979);
Köhler et al., Proc. Nat'l. Acad. Sci. USA 77:2197 (1980); Raso et
al., Cancer Res. 41:2073 (1981); Morrison et al., Ann. Rev. Immunol.
2:239 (1984); Morrison, Science 229:1202 (1985); Morrison et al.,
Proc. Nat'l. Acad. Sci. USA 81:6851 (1984); EP 255,694; EP 266,663;
and WO 88/03559. Reassorted immunoglobulin chains also are known.
See for example U.S. patent 4,444,878; WO 88/03565; and EP 68,763 and
references cited therein. The immunoglobulin moiety in the chimeras
of the present invention may be obtained from IgG1, IgG2, IgG3, or
IgG4 subtypes, IgA, IgE, IgD or IgM, but preferably IgG1 or IgG3.

II. Selection of the HGF Variants 
The present invention is based upon the study of
structure-activity and structure-receptor binding relationships in
amino acid sequence variants of HGF.

Certain HGF variants of the present invention are resistant to
proteolytic cleavage by enzymes that are capable of in vivo conversion
of the single-chain HGF proenzyme into its two-chain form. Such
enzymes are trypsin-like proteases. Absent alterations, the
proteolytic cleavage takes place between Arg494 and Val495 of the
wild-type huHGF sequence. The resistance to proteolytic cleavage is
preferably achieved by site-directed mutagenesis within a region
recognized by an enzyme capable of converting HGF into its two-chain
form, and preferably within the Leu-Arg-Val-Val (LRVV) sequence at
amino acid residues 493-496 of the wild-type huHGF sequence. The
variants herein may, for example, contain single or multiple amino
acid substitutions, insertions or deletions at or adjacent to amino
acid positions 493, 494, 495 and 496 in the wild-type human HGF amino
acid sequence.

A preferred alteration is the replacement of arginine at amino
acid position 494 with any other amino acid, preferably glutamic acid,
aspartic acid or alanine. In general, the substitution of smaller,
apolar or acidic amino acids for arginine at this position is believed
to yield single-chain HGF variants.

Alternatively or in addition, the replacement of valine at
position 495 by another amino acid is expected to block the one-chain
to two-chain cleavage. Bulkier amino acids, such as tyrosine,
phenylalanine, etc. are preferred for substitution at this position.
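
The substitutions at the cleavage site can be illustrated on the
four-residue LRVV motif alone. This is a sketch under stated
assumptions: the fragment is taken to start at position 493, labels use
the one-letter form R494E, and the function name substitute is
hypothetical. It performs string bookkeeping only and makes no
prediction about actual protease resistance.

```python
import re


def substitute(fragment: str, first_position: int, label: str) -> str:
    """Apply one substitution such as 'R494E' to a sequence fragment.

    first_position is the sequence position of fragment[0].
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label}")
    wild, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_position
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} lies outside the fragment")
    if fragment[index] != wild:
        raise ValueError(
            f"expected {wild} at position {position}, found {fragment[index]}"
        )
    return fragment[:index] + new + fragment[index + 1:]


WILD_TYPE_SITE = "LRVV"  # residues 493-496 of wild-type huHGF

single = substitute(WILD_TYPE_SITE, 493, "R494E")
double = substitute(single, 493, "V495Y")
print(single, double)  # prints LEVV LEYV
```

Checking the stated wild-type residue before replacing it catches
labels that are numbered against a different reference sequence.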

Other HGF variants of the present invention are altered at a site
within the protease domain of HGF and retain substantially full
receptor binding affinity of the corresponding (preferably human)
wild-type HGF. The protease domain follows the cleavage site
between amino acid positions 494 and 495 in the wild-type huHGF
sequence, and shows a high degree of homology with the catalytic
domain of known serine proteases. The conservation does not apply to
the active site of serine proteases. In human plasmin, which is
formed from its proenzyme, plasminogen, residues His-42, Asp-85 and
Ser-181 form the catalytic site (catalytic triad). This catalytic
triad is highly conserved in serine proteases. In the huHGF amino
acid sequence asparagine is retained at amino acid position 594;
however, position 534 (corresponding to position 42 of plasmin) is
occupied by glutamine instead of histidine, and position 673
(corresponding to position 181 of plasmin) by tyrosine instead of
serine. A preferred group of the protease domain alterations herein
involves one or both of amino acid positions 673 and 534.
Alternatively, or in addition, the alteration may be at position 692
of the huHGF amino acid sequence. In all instances, the alteration
preferably is the substitution of one or more different amino acids
for the residues at these positions of the native huHGF amino acid
sequence.

Tyrosine at amino acid position 673 is preferably replaced by an
amino acid which has no bulky aromatic or heterocyclic moieties. Such
amino acids include serine, threonine, asparagine, cysteine, glycine,
alanine and valine. In the preferred variants, serine is substituted
for tyrosine at this position.

Valine at amino acid position 692 preferably is substituted by a
polar amino acid, such as serine, threonine, asparagine or glutamine,
preferably serine.

In a preferred group of the HGF amino acid sequence variants
herein, both position 673 and position 692 are substituted by one of
the foregoing amino acids, preferably serine. Such variants may
additionally contain an alteration (preferably substitution) at amino
acid position 534. The latter alteration may be the substitution of
histidine for glutamine in the wild-type huHGF amino acid sequence.

The single, double or triple mutations within the protease domain
may be combined with additional alterations in the wild-type HGF amino
acid sequence. Such further alterations may, for example, be at or
around the one-chain to two-chain cleavage site of the HGF molecule,
as hereinabove described, and may result in variants which are
substantially in single-chain form.

Additional alterations may be at the C-terminal end and/or in the
Kringle domains of the HGF molecule. In addition to the deletion
mutants disclosed in the examples, HGF variants with alterations
within the Kringle 1 domain are of great interest. As we have found
that the receptor binding domain is contained within the finger and
the Kringle 1 regions of the HGF molecule, amino acid alterations
within these domains are expected to significantly alter the receptor
binding properties (and the biological activity) of the variants of
the present invention. Alterations at residues that are most exposed
to the interior in the Kringle structure (mostly charged residues) are
particularly likely to cause profound changes in the receptor binding
properties and/or biological activity of the HGF variants.

Alterations within the Kringle 1 domain preferably are within the
patch defined by amino acid positions 159, 161, 195 and 197 of the
wild-type huHGF amino acid sequence or at corresponding positions in a
non-human HGF amino acid sequence. Another preferred site for amino
acid alteration is at position 173 of the wild-type huHGF amino acid
sequence. The latter position is at the opposite side as compared to
the surface defined by amino acid positions 159, 161, 195 and 197 and
the reasons for its involvement in the binding properties and
biological activity of HGF have not yet been fully identified.

Some illustrative huHGF variants within the scope herein are as
follows: R494E; R494D; R494A; V495Y; V495F; R494E, V495Y; R494E,
V495F; R494D, V495Y; R494D, V495F; R494A, V495Y; R494A, V495F;
R494[E]V495; R494[D]V495; R494[A]V495; R494[Y]V495; R494[F]V495;
R494E, Q534H; R494E, Y673S; R494E, V692S; R494D, Q534H; R494D, Y673S;
R494D, V692S; R494A, Q534H; R494A, Y673S; R494A, V692S; R494E, Y673S,
V692S; R494D, Y673S, V692S; R494A, Y673S, V692S; R494E, Q534H, Y673S,
V692S; R494D, Q534H, Y673S, V692S; R494A, Q534H, Y673S, V692S; E159A;
S161A; F162A, L163A, S165A, S166A; F162A; L163A, S165A, S166A; Y167F;
Y167A; R168A; Q173A; Q173A, E174A, N175A; N193A; R195A; R197A; N193A,
R195A, R197A; K52A; D54A; K52A, D54A; H114A; H114A, E115A, D117A;
E115A; D117A; variants with combinations of any of the foregoing
alterations; ΔK3 and/or ΔK4 variants comprising any of the foregoing
alterations; corresponding delta5-huHGF variants and non-human animal
HGF variants.

III. Construction of the HGF Variants 

Whereas any technique known in the art can be used to perform
site-directed mutagenesis, e.g. as disclosed in Sambrook et al.
[Molecular Cloning: A Laboratory Manual, second edition, Cold Spring
Harbor Laboratory Press, New York (1989)], oligonucleotide-directed
mutagenesis is the preferred method for preparing the HGF variants of
this invention. This method, which is well known in the art [Adelman
et al., DNA, 2:183 (1983), Sambrook et al., Supra], is particularly
suitable for making substitution variants; it may also be used to
conveniently prepare deletion and insertion variants.

As will be appreciated, the site-specific mutagenesis technique
typically employs a phage vector that exists in both a single-stranded
and double-stranded form. Typical vectors useful in site-directed
mutagenesis include vectors such as the M13 phage, for example, as
disclosed by Messing et al., Third Cleveland Symposium on
Macromolecules and Recombinant DNA, Editor A. Walton, Elsevier,
Amsterdam (1981). These phage are readily commercially available and
their use is generally well known to those skilled in the art.
Alternatively, plasmid vectors that contain a single-stranded phage
origin of replication (Vieira et al., Meth. Enzymol., 153:3 (1987))
may be employed to obtain single-stranded DNA.

The oligonucleotides are readily synthesized using techniques
well known in the art such as that described by Crea et al. (Proc.
Nat'l. Acad. Sci. USA, 75:5765 [1978]).

The specific mutagenesis method followed in making the HGF
variants of Example 1 was described by Kunkel et al., Methods in
Enzymol. 154:367-382 (1987).

Mutants with more than one amino acid substituted may be
generated in one of several ways. If the amino acids are located
close together in the polypeptide chain, they may be mutated
simultaneously using one oligonucleotide that codes for all of the
desired amino acid substitutions. If, however, the amino acids are
located some distance from each other (separated by more than ten
amino acids, for example) it is more difficult to generate a single
oligonucleotide that encodes all of the desired changes. Instead, one
of two alternative methods may be employed. In the first method, a
separate oligonucleotide is generated for each amino acid to be
substituted. The oligonucleotides are then annealed to the
single-stranded template DNA simultaneously, and the second strand of
DNA that is synthesized from the template will encode all of the
desired amino acid substitutions. The alternative method involves two
or more rounds of mutagenesis to produce the desired mutant.
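
The choice between one oligonucleotide and several can be sketched as a
grouping step. The sketch below is illustrative only: it treats the
"more than ten amino acids" figure in the text as a fixed threshold
(max_gap, an assumed parameter), and the function name
group_sites_by_oligo is hypothetical. Each group would be covered by a
single mutagenic oligonucleotide.

```python
from typing import Iterable, List


def group_sites_by_oligo(
    positions: Iterable[int], max_gap: int = 10
) -> List[List[int]]:
    """Group mutation sites so each group can share one oligonucleotide.

    Two neighbouring sites fall in the same group when they are separated
    by no more than max_gap residues. The threshold is an assumption
    taken from the example figure in the text, not a fixed rule.
    """
    groups: List[List[int]] = []
    for position in sorted(set(positions)):
        if groups and position - groups[-1][-1] <= max_gap:
            groups[-1].append(position)
        else:
            groups.append([position])
    return groups


# Sites of a hypothetical R494E, V495Y, Q534H, Y673S, V692S variant.
print(group_sites_by_oligo([494, 495, 534, 673, 692]))
# prints [[494, 495], [534], [673], [692]]
```

Because grouping is chained through neighbouring sites, a long run of
closely spaced sites lands in one group; a practical design would also
cap the total oligonucleotide length.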

Another method for making mutations in the DNA sequence encoding
wild-type HGF or a variant molecule known in the art involves
cleaving the DNA sequence encoding the starting HGF molecule at the
appropriate position by digestion with restriction enzymes, recovering
the properly cleaved DNA, synthesizing an oligonucleotide encoding the
desired amino acid sequence and flanking regions such as polylinkers
with blunt ends (or, instead of polylinkers, digesting the synthetic
oligonucleotide with the restriction enzymes also used to cleave the
HGF encoding DNA, thereby creating cohesive termini), and ligating the
synthetic DNA into the remainder of the HGF encoding structural gene.

PCR mutagenesis is also suitable for making the HGF variants of
the present invention, for example, as described in U.S. Patent No.
4,683,195 issued 28 July 1987 and in Current Protocols in Molecular
Biology, Ausubel et al., eds. Greene Publishing Associates and
Wiley-Interscience, Volume 2, Chapter 15, 1991. While the following
discussion refers to DNA, it is understood that the technique also
finds application with RNA. The PCR technique generally refers to the
following procedure. When small amounts of template DNA are used as
starting material in a PCR, primers that differ slightly in sequence
from the corresponding region in a template DNA can be used to
generate relatively large quantities of a specific DNA fragment that
differs from the template sequence only at the positions where the
primers differ from the template. For introduction of a mutation into
a plasmid DNA, one of the primers is designed to overlap the position
of the mutation and to contain the mutation; the sequence of the other
primer must be identical to a stretch of sequence of the opposite
strand of the plasmid, but this sequence can be located anywhere along
the plasmid DNA. It is preferred, however, that the sequence of the
second primer is located within 200 nucleotides from that of the
first, such that in the end the entire amplified region of DNA bounded
by the primers can be easily sequenced. PCR amplification using a
primer pair like the one just described results in a population of DNA
fragments that differ at the position of the mutation specified by the
primer, and possibly at other positions, as template copying is
somewhat error-prone. If the ratio of template to product material is
extremely low, the vast majority of product DNA fragments incorporate
the desired mutation(s). This product material is used to replace the
corresponding region in the plasmid that served as PCR template using
standard DNA technology. Mutations at separate positions can be
introduced simultaneously by either using a mutant second primer or
performing a second PCR with different mutant primers and ligating the
two resulting PCR fragments simultaneously to the vector fragment in a
three (or more)-part ligation.
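
The primer pair described above can be sketched in a few lines. This is
a simplified illustration, not a primer design tool: it ignores melting
temperature and secondary structure, the template is an invented
36-nucleotide string rather than HGF cDNA, and the function names and
the flank and length parameters are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mutagenic_primer(
    template: str, codon_start: int, new_codon: str, flank: int = 9
) -> str:
    """Build a primer that overlaps the mutation and contains it.

    codon_start is the 0-based index of the codon to replace; flank is
    the number of unchanged template bases kept on each side.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides long")
    left = template[max(0, codon_start - flank):codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    return left + new_codon.upper() + right


def second_primer(template: str, start: int, length: int = 18) -> str:
    """Second primer: identical to a stretch of the opposite strand."""
    return reverse_complement(template[start:start + length])


def within_sequencing_range(
    first_start: int, second_start: int, limit: int = 200
) -> bool:
    """Check the preference that the two primers lie within 200 nt."""
    return abs(second_start - first_start) <= limit


# Invented template: the CGT codon at index 15 is changed to GAA.
template = "AAACCCGGGTTTCTGCGTGTTGTTAAACCCGGGTTT"
print(mutagenic_primer(template, 15, "GAA", flank=6))  # TTTCTGGAAGTTGTT
print(second_primer(template, 24, length=6))           # GGGTTT
```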

The cDNA encoding the HGF variants of the present invention is
inserted into a replicable vector for further cloning or expression.
Suitable vectors are prepared using standard recombinant DNA
procedures. Isolated plasmids and DNA fragments are cleaved,
tailored, and ligated together in a specific order to generate the
desired vectors.

After ligation, the vector with the foreign gene now inserted is
transformed into a suitable host cell. The transformed cells are
selected by growth on an antibiotic, commonly tetracycline (tet) or
ampicillin (amp), to which they are rendered resistant due to the
presence of tet and/or amp resistance genes on the vector. If the
ligation mixture has been transformed into a eukaryotic host cell,
transformed cells may be selected by the DHFR/MTX system. The
transformed cells are grown in culture and the plasmid DNA (plasmid
refers to the vector ligated to the foreign gene of interest) is then
isolated. This plasmid DNA is then analyzed by restriction mapping
and/or DNA sequencing. DNA sequencing is generally performed by
either the method of Messing et al., Nucleic Acids Res., 9:309 (1981)
or by the method of Maxam et al., Methods of Enzymology, 65:499
(1980).

Prokaryotes are the preferred host cells for the initial cloning
steps of this invention. They are particularly useful for rapid
production of large amounts of DNA, for production of single-stranded
DNA templates used for site-directed mutagenesis, for screening many
mutants simultaneously, and for DNA sequencing of the mutants
generated. For expressing the HGF variants of the present invention
eukaryotic hosts, such as eukaryotic microbes (yeast) and
multicellular organisms (mammalian cell cultures) may also be used.
Examples of prokaryotes, e.g. E. coli, eukaryotic microorganisms and
multicellular cell cultures, and expression vectors, suitable for use
in producing the HGF variants of the present invention are, for
example, those disclosed in WO 90/02798 (published 22 March 1990).

Cloning and expression methodologies are well known in the art 
and are, for example, disclosed in the foregoing published PCT patent 
application (WO 90/02798) . 

If mammalian cells are used as host cells, transfection generally
is carried out by the calcium phosphate precipitation method as
described by Graham and Van der Eb, Virology, 52:546 (1978).
However, other methods for introducing DNA into cells such as nuclear
injection, electroporation, or protoplast fusion are also suitably
used.

If yeast are used as the host, transfection is generally
accomplished using polyethylene glycol, as taught by Hinnen, Proc.
Natl. Acad. Sci. U.S.A., 75:1929-1933 (1978).

If prokaryotic cells or cells that contain substantial cell wall
constructions are used, the preferred method of transfection is
calcium treatment as described by Cohen et al., Proc. Natl. Acad.
Sci. (USA) 69:2110 (1972), or more recently electroporation.

The HGF variant preferably is recovered from the culture medium
as a secreted protein, although it also may be recovered from host
cell lysates when directly expressed without a secretory signal. When
the variant is expressed in a recombinant cell other than one of human
origin, the variant is thus completely free of proteins of human
origin. However, it is necessary to purify the variant from
recombinant cell proteins in order to obtain preparations that are
substantially homogeneous as to protein. As a first step, the culture
medium or lysate is centrifuged to remove particulate cell debris.

The variant is then purified from contaminant soluble proteins,
for example, by an appropriate combination of conventional
chromatography methods, e.g. gel filtration, ion-exchange, hydrophobic
interaction, affinity, immunoaffinity chromatography, reverse phase
HPLC; precipitation, e.g. ethanol precipitation, ammonium sulfate
precipitation, or, preferably, immunoprecipitation with anti-HGF
(polyclonal or monoclonal) antibodies covalently linked to Sepharose.
Due to its high affinity to heparin, HGF can be conveniently purified
on a heparin column, such as a heparin-Sepharose column. One skilled
in the art will appreciate that purification methods suitable for
native HGF may require modification to account for changes in the
character of HGF or its variants upon expression in recombinant cell
culture.

As hereinabove described, huHGF contains four putative
glycosylation sites, which are located at positions 294 and 402 of the
α-chain and at positions 566 and 653 of the β-chain. These positions
are conserved in the rat HGF amino acid sequence. Glycosylation
variants are within the scope herein.

Glycosylation of polypeptides is typically either N-linked or
O-linked. N-linked refers to the attachment of the carbohydrate moiety
to the side-chain of an asparagine residue. The tripeptide sequences,
asparagine-X-serine and asparagine-X-threonine, wherein X is any amino
acid except proline, are recognition sequences for enzymatic
attachment of the carbohydrate moiety to the asparagine side chain.
O-linked glycosylation refers to the attachment of one of the sugars
N-acetylgalactosamine, galactose, or xylose to a hydroxyamino acid,
most commonly serine or threonine, although 5-hydroxyproline or
5-hydroxylysine may also be involved in O-linked glycosylation.
O-linked glycosylation sites may, for example, be modified by the
addition of, or substitution by, one or more serine or threonine
residues to the amino acid sequence of the HGF molecule. For ease,
changes are usually made at the DNA level, essentially using the
techniques discussed hereinabove with respect to the amino acid
sequence variants.

Chemical or enzymatic coupling of glycosides to the HGF variants
of the present invention may also be used to modify or increase the
number or profile of carbohydrate substituents. These procedures are
advantageous in that they do not require production of the polypeptide
that is capable of O-linked (or N-linked) glycosylation. Depending on
the coupling mode used, the sugar(s) may be attached to (a) arginine
and histidine, (b) free carboxyl groups, (c) free sulfhydryl groups
such as those of cysteine, (d) free hydroxyl groups such as those of
serine, threonine, or hydroxyproline, (e) aromatic residues such as
those of phenylalanine, tyrosine, or tryptophan or (f) the amide group
of glutamine. These methods are described in WO 87/05330 (published
11 September 1987), and in Aplin and Wriston, CRC Crit. Rev. Biochem.,
pp. 259-306 (1981).

Carbohydrate moieties present on an HGF variant may also be
removed chemically or enzymatically. Chemical deglycosylation
requires exposure to trifluoromethanesulfonic acid or an equivalent
compound. This treatment results in the cleavage of most or all
sugars, except the linking sugar, while leaving the polypeptide
intact. Chemical deglycosylation is described by Hakimuddin et al.,
Arch. Biochem. Biophys. 259, 52 (1987) and by Edge et al., Anal.
Biochem. 118, 131 (1981). Carbohydrate moieties can be removed by a
variety of endo- and exoglycosidases as described by Thotakura et al.,
Meth. Enzymol. 138, 350 (1987). Glycosylation is suppressed by
tunicamycin as described by Duskin et al., J. Biol. Chem. 257, 3105
(1982). Tunicamycin blocks the formation of protein-N-glycoside
linkages.

Glycosylation variants of the amino acid sequence variants herein
can also be produced by selecting appropriate host cells. Yeast, for
example, introduce glycosylation which varies significantly from that
of mammalian systems. Similarly, mammalian cells having a different
species (e.g. hamster, murine, insect, porcine, bovine or ovine) or
tissue (e.g. lung, liver, lymphoid, mesenchymal or epidermal) origin
than the source of the selectin variant, are routinely screened for
the ability to introduce variant glycosylation.

Covalent modifications of an HGF variant molecule are included
within the scope herein. Such modifications are traditionally
introduced by reacting targeted amino acid residues of the HGF variant
with an organic derivatizing agent that is capable of reacting with
selected side-chains or terminal residues, or by harnessing mechanisms
of post-translational modifications that function in selected
recombinant host cells. The resultant covalent derivatives are useful
in programs directed at identifying residues important for biological
activity, for immunoassays of the HGF variants, or for the preparation
of anti-HGF antibodies for immunoaffinity purification of the
recombinant glycoprotein. For example, complete inactivation of the
biological activity of the protein after reaction with ninhydrin would
suggest that at least one arginyl or lysyl residue is critical for its
activity, whereafter the individual residues which were modified under
the conditions selected are identified by isolation of a peptide
fragment containing the modified amino acid residue. Such
modifications are within the ordinary skill in the art and are
performed without undue experimentation.

Derivatization with bifunctional agents is useful for preparing
intramolecular aggregates of the HGF variants as well as for
cross-linking the HGF variants to a water insoluble support matrix or
surface for use in assays or affinity purification. In addition, a
study of interchain cross-links will provide direct information on
conformational structure. Commonly used cross-linking agents include
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde,
N-hydroxysuccinimide esters, homobifunctional imidoesters, and
bifunctional maleimides. Derivatizing agents such as
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable
intermediates which are capable of forming cross-links in the presence
of light. Alternatively, reactive water insoluble matrices such as
cyanogen bromide activated carbohydrates and the systems reactive
substrates described in U.S. patent Nos. 3,959,642; 3,969,287;
3,691,016; 4,195,128; 4,247,642; 4,229,537; 4,055,635; and 4,330,440
are employed for protein immobilization and cross-linking.

Certain post-translational modifications are the result of the
action of recombinant host cells on the expressed polypeptide.
Glutaminyl and asparaginyl residues are frequently
post-translationally deamidated to the corresponding glutamyl and
aspartyl residues. Alternatively, these residues are deamidated under
mildly acidic conditions. Either form of these residues falls within
the scope of this invention.

Other post-translational modifications include hydroxylation of
proline and lysine, phosphorylation of hydroxyl groups of seryl or
threonyl residues, methylation of the α-amino groups of lysine,
arginine, and histidine side chains [T.E. Creighton, Proteins:
Structure and Molecular Properties, W.H. Freeman & Co., San Francisco,
pp. 79-86 (1983)].

Other derivatives comprise the novel HGF variants of this
invention covalently bonded to a nonproteinaceous polymer. The
nonproteinaceous polymer ordinarily is a hydrophilic synthetic
polymer, i.e. a polymer not otherwise found in nature. However,
polymers which exist in nature and are produced by recombinant or in
vitro methods are useful, as are polymers which are isolated from
nature. Hydrophilic polyvinyl polymers fall within the scope of this
invention, e.g. polyvinylalcohol and polyvinylpyrrolidone.
Particularly useful are polyvinylalkylene ethers such as polyethylene
glycol and polypropylene glycol.

The HGF variants may be linked to various nonproteinaceous
polymers, such as polyethylene glycol, polypropylene glycol or
polyoxyalkylenes, in the manner set forth in U.S. Patent Nos.
4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

The HGF variants may be entrapped in microcapsules prepared, for
example, by coacervation techniques or by interfacial polymerization,
in colloidal drug delivery systems (e.g. liposomes, albumin
microspheres, microemulsions, nano-particles and nanocapsules), or in
macroemulsions. Such techniques are disclosed in Remington's
Pharmaceutical Sciences, 16th Edition, Osol, A., Ed. (1980).

An HGF variant sequence can be linked to an immunoglobulin constant
domain sequence as hereinbefore defined. The resultant molecules are
commonly referred to as HGF variant-immunoglobulin chimeras. Such
chimeras can be constructed essentially as described in WO 91/08298
(published 13 June 1991).

Ordinarily, the HGF variant is fused C-terminally to the
N-terminus of the constant region of an immunoglobulin in place of the
variable region(s), however N-terminal fusions of the selectin
variants are also desirable. The transmembrane regions of the HGF
variants are preferably inactivated or deleted prior to fusion.

Typically, such fusions retain at least functionally active
hinge, CH2 and CH3 domains of the constant region of an immunoglobulin
heavy chain. Fusions are also made to the C-terminus of the Fc
portion of a constant domain, or immediately N-terminal to the CH1 of
the heavy chain or the corresponding region of the light chain. This
ordinarily is accomplished by constructing the appropriate DNA
sequence and expressing it in recombinant cell culture.
Alternatively, however, the HGF variant-immunoglobulin chimeras of
this invention may be synthesized according to known methods.

The precise site at which the fusion is made is not critical;
particular sites are well known and may be selected in order to
optimize the biological activity, secretion or binding characteristics
of the HGF variant.

In some embodiments, the hybrid immunoglobulins are assembled as
monomers, or hetero- or homo-multimers, and particularly as dimers or
tetramers, essentially as illustrated in WO 91/08298, Supra.

In a preferred embodiment, the C-terminus of a sequence which
contains the binding site(s) for an HGF receptor is fused to the
N-terminus of the C-terminal portion of an antibody (in particular the
Fc domain), containing the effector functions of an immunoglobulin,
e.g. immunoglobulin G1. It is possible to fuse the entire heavy chain
constant region to the sequence containing the receptor binding
site(s). However, more preferably, a sequence beginning in the hinge
region just upstream of the papain cleavage site (which defines IgG Fc
chemically; residue 216, taking the first residue of heavy chain
constant region to be 114 [Kabat et al., Supra], or analogous sites of
other immunoglobulins) is used in the fusion. In a particularly
preferred embodiment, the amino acid sequence containing the receptor
binding site(s) is fused to the hinge region and CH2 and CH3 or CH1,
hinge, CH2 and CH3 domains of an IgG1, IgG2 or IgG3 heavy chain. The
precise site at which the fusion is made is not critical, and the
optimal site can be determined by routine experimentation.

HGF variant-immunoglobulin chimeras may, for example, be used in
protein A purification, immunohistochemistry, and immunoprecipitation
techniques in place of anti-HGF antibodies, and can facilitate
screening of inhibitors of HGF-HGF receptor interactions.
Therapeutically, they are expected to confer advantages such as longer
half-life as compared to the corresponding HGF variant molecule.

IV. Therapeutic Compositions

The HGF variants with enhanced receptor binding affinity can be
used to block the binding of wild-type HGF to its receptor. This
would permit the treatment of pathologic conditions associated with
the activation of an HGF receptor, such as malignancies associated
with chronic HGF receptor activation.

The compounds of the present invention can be formulated
according to known methods to prepare pharmaceutically useful
compositions, whereby the HGF product is combined in admixture with a
pharmaceutically acceptable carrier. Suitable carriers and their
formulations are described in Remington's Pharmaceutical Sciences,
16th ed., 1980, Mack Publishing Co., edited by Oslo et al. These
compositions will typically contain an effective amount of the HGF
variant, for example, on the order of about 0.5 to about 10
mg/ml, together with a suitable amount of carrier to prepare
pharmaceutically acceptable compositions suitable for effective
administration to the patient. The variants may be administered
parenterally or by other methods that ensure their delivery to the
bloodstream in an effective form.

Compositions particularly well suited for the clinical
administration of the HGF variants used to practice this invention
include sterile aqueous solutions or sterile hydratable powders such
as lyophilized protein. Typically, an appropriate amount of a
pharmaceutically acceptable salt is also used in the formulation to
render the formulation isotonic.

Dosages and desired drug concentrations of pharmaceutical
compositions of this invention may vary depending on the particular
use envisioned. A typical effective dose in rat experiments is about
250 µg/kg administered as an intravenous bolus injection.
Interspecies scaling of dosages can be performed in a manner known in
the art, e.g. as disclosed in Mordenti et al., Pharmaceut. Res. 8,
1351 (1991) and in the references cited therein.

The following examples merely illustrate the best mode now 
contemplated for practicing the invention, but should not be construed 
to limit the invention. 

V. Examples 

A series of recombinant huHGF (rhuHGF) variants were produced to
determine the structural and functional importance of the cleavage of
the prohormone to the α/β dimer and of the Kringle and protease-like
domains. Mutations were introduced into the huHGF cDNA in a CMV
based expression plasmid and conditioned media from stable populations
of human 293 cells expressing each variant were assayed by Western
blotting to monitor the size and expression level of the HGF variants.
The concentration of each huHGF derivative was confirmed with two
types of sandwich ELISA assays. The differences in expression levels
found in ELISA correlated with those observed on Western blots. For
most variants, the level of expression was in the range of 1-5 µg/mL.
For variants with expression levels below 0.6 µg/mL, the conditioned
media was concentrated.

The mitogenic activity on liver cells in primary culture and
ability to bind to the HGF receptor were then determined. The
extracellular domain of the HGF receptor was fused to the constant
region (Fc) of a human IgG and binding was performed in solution.

The construction of the rhuHGF variants, the assay methods and
the analysis of the results obtained with the various mutants are
described in the following examples.

EXAMPLE 1 

Recombinant Production of the huHGF Variants 
A. Site-directed mutagenesis

Plasmid DNA isolation, polyacrylamide and agarose gel
electrophoresis were performed as disclosed in Sambrook et al., supra.

Mammalian expression plasmid pRK 5.1 with a CMV promoter
(Genentech, Inc.) was used for mutagenesis of huHGF, allowing secretion
of the HGF variants into the culture medium, where they were directly
assayed for biological activity and binding. This expression vector is
a derivative of pRK5, the construction of which is disclosed in EP
307,247 published 15 March 1989. The nucleotide sequence of the
pRK 5.1 vector is shown in Figure 5 (SEQ. ID. NO: 1).

The huHGF cDNA used corresponds to the 728 amino acid form as
published earlier (Miyazawa et al., 1989, supra).

Mutagenesis was performed according to the method of Kunkel using
the commercially available dut- ung- strain of E. coli [Kunkel et
al., Method. Enzymol. 154, 367-382 (1987)]. Synthetic
oligonucleotides used for in vitro mutagenesis and sequencing primers
were prepared using the Applied Biosystems 380A DNA synthesizer as
described [Matteucci et al., J. Am. Chem. Soc. 103, 3185-3191 (1981)].
For generation of the desired mutants, oligonucleotides of sequences
coding for the desired amino acid substitutions were synthesized and
used as primers. The oligonucleotides were annealed to
single-stranded pRK 5.1-huHGF that had been prepared by standard
procedures [Viera et al., Method. Enzymol. 142, 3 (1987)].

A mixture of three deoxyribonucleotides, deoxyriboadenosine
(dATP), deoxyriboguanosine (dGTP), and deoxyribothymidine (dTTP), was
combined with a modified thio-deoxyribonucleoside called dCTP(aS)
provided in the kit by the manufacturer, and added to the
single-stranded pRK 5.1-huHGF to which was annealed the oligonucleotide.

Upon addition of DNA polymerase to this mixture, a strand of DNA
identical to pRK 5.1-huHGF except for the mutated bases was generated.
In addition, this new strand of DNA contained dCTP(aS) instead of
dCTP, which served to protect it from restriction endonuclease
digestion. After the template strand of the double-stranded
heteroduplex was nicked with an appropriate restriction enzyme, the
template strand was digested with ExoIII nuclease past the region that
contained the mutagenic oligomer. The reaction was then stopped to
leave a molecule that was only partly single-stranded. A complete
double-stranded DNA homoduplex molecule was then formed by DNA
polymerase in the presence of all four deoxyribonucleotide
triphosphates, ATP, and DNA ligase.

The following oligonucleotides were prepared to use as primers to
generate pRK 5.1-huHGF variant molecules:

R494E huHGF: TTGGAATCCCATTTACAACCTCGAGTTGTTTCGTTTTGGCACAAGAT
             (SEQ. ID. NO: 2)
R494D huHGF: GAATCCCATTTACGACGTCCAATTGTTTCG (SEQ. ID. NO: 3)
R494A huHGF: CCCATTTACAACTGCCAATTGTTTCG
Q534H huHGF: AGAAGGGAAACAGTGTCGTGCA
Y673S huHGF: AGTGGGCCACCAGAATCCCCCT
V692S huHGF: TCCACGACCAGGAGAAATGACAC
ΔK1 huHGF:   GCATTCAACTTCTGAGTTTCTAATGTAGTC
             CATAGTATTGTCAGCTTCAACTTCTGAACA
             TCCATGTGACATATCTTCAGTTGTTTCCAA
             TGTGGTATCACCTTCATCTTGTCCATGTGA
N-303 huHGF: AGCTTGGATGCATTAAGTTGTTTC (SEQ. ID. NO: 12)
N-384 huHGF: TTGTCCATGTGATTAATCACAGT (SEQ. ID. NO: 13)
α-chain:     GTTCGTGTTGGGATCCCATTTACCTATCGCAATTG (SEQ. ID. NO: 14)
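
As an illustrative check that is not part of the original disclosure,
the mutagenic primers can be read as antisense strands: taking the
reverse complement of a primer and translating it in a suitable reading
frame should give the HGF sequence around residue 494 with the intended
substitution. The minimal Python sketch below does this for the R494E
and R494D primers (SEQ. ID. NOS: 2 and 3). The helper names and the
frame offsets are assumptions chosen for this example only.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, frame=0):
    """Translate complete codons of seq, starting at offset `frame`."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Primer sequences as listed for SEQ. ID. NOS: 2 and 3.
R494E_PRIMER = "TTGGAATCCCATTTACAACCTCGAGTTGTTTCGTTTTGGCACAAGAT"
R494D_PRIMER = "GAATCCCATTTACGACGTCCAATTGTTTCG"

# Frame offsets (1 and 2) were found by inspection and are assumptions.
r494e_peptide = translate(reverse_complement(R494E_PRIMER), frame=1)
r494d_peptide = translate(reverse_complement(R494D_PRIMER), frame=2)

print(r494e_peptide)  # -> SCAKTKQLEVVNGIP (E in place of R494)
print(r494d_peptide)  # -> KQLDVVNGI (D in place of R494)
```

In both translations the residue following "KQL" is the substituted
position, and it is followed by V495 and V496 of the proposed cleavage
site.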


The Y673S,V692S huHGF variant was obtained from wild-type huHGF
as a template, using both oligonucleotides used for generating the two
mutations.

The mutant huHGF constructs generated using the protocol above
were transformed in E. coli host strain MM294tonA using the standard
calcium chloride procedure (Sambrook et al., supra) for preparation
and transformation of competent cells. MM294tonA (which is resistant
to T1 phage) was prepared by the insertion and subsequent imprecise
excision of a Tn10 transposon into the tonA gene. This gene was then
inserted, using transposon insertion mutagenesis [Kleckner et al.,
J. Mol. Biol. 116, 125-159 (1977)], into E. coli host MM294 (ATCC
31,446).

The DNA was extracted from individual colonies of bacterial
transformants using the standard miniprep procedure of Sambrook et
al., supra. The plasmids were further purified by passage over a
Sephacryl CL6B spin column, and then analyzed by sequencing and by
restriction endonuclease digestion and agarose gel electrophoresis.

B. Transfection of Human Embryonic Kidney 293 Cells

Plasmids with the correct sequence were used to transfect human
fetal kidney 293 cells by the calcium phosphate method. 293 cells were
grown to 70% confluence in 6-well plates. 2.5 µg of huHGF plasmid
DNA variant was dissolved in 150 µl of 1 mM Tris-HCl, 0.1 mM EDTA,
0.227 M CaCl2. Added to this (dropwise while vortexing) was 150 µl of
50 mM HEPES buffer (pH 7.35), 280 mM NaCl, 1.5 mM NaPO4, and the
precipitate was allowed to form for ten minutes at 25°C. The
suspended precipitate was then added to the cells in the individual
wells in a 6-well plate. The cell monolayers were incubated for 4
hours in the presence of the DNA precipitate, washed once with PBS,
and cultured in serum-free medium for 72 h. When stable populations
were made, the HGF cDNA was subcloned in an episomal CMV driven
expression plasmid pCisEBON (G. Cachianes, C. Ho, R. Weber, S.
Williams, D. Goeddel, and D. Leung, in preparation). pCisEBON is a
pRK5 derivative; its underlying nucleotide sequence is shown in Figure
6 (SEQ. ID. NO: 15). The populations were directly selected in
Neomycin selective medium.
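
For orientation only, the composition of the precipitation mixture that
results from combining the two solutions in equal volumes can be worked
out as below. This is a minimal sketch and not part of the disclosed
protocol: the function name is hypothetical, only the solutes whose
concentrations are stated above are included, and the calculation
assumes simple additive volumes.

```python
import math


def mix_concentrations(vol_a, conc_a, vol_b, conc_b):
    """Final concentrations after mixing two solutions.

    conc_a and conc_b map solute name -> concentration (same unit);
    a solute absent from one solution counts as zero there.
    """
    total = vol_a + vol_b
    solutes = set(conc_a) | set(conc_b)
    return {
        s: (vol_a * conc_a.get(s, 0.0) + vol_b * conc_b.get(s, 0.0)) / total
        for s in solutes
    }


# Concentrations in mM, taken from the transfection protocol.
dna_solution = {"CaCl2": 227.0}
hepes_buffer = {"HEPES": 50.0, "NaCl": 280.0, "phosphate": 1.5}

# Equal volumes (150 of each, in the same volume unit).
mixed = mix_concentrations(150.0, dna_solution, 150.0, hepes_buffer)
for solute in sorted(mixed):
    print(f"{solute}: {mixed[solute]:.2f} mM")
```

With equal volumes each component is simply halved, giving about 113.5
mM CaCl2, 25 mM HEPES, 140 mM NaCl and 0.75 mM phosphate.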
EXAMPLE 2
Assay Methods

In view of the pleiotropic activities of HGF, a molecule with a
structure unlike any other known growth factor, it is important to
understand the molecular interaction of this factor with its receptor.
The huHGF variants produced as described in Example 1 were analyzed
for their ability to induce DNA synthesis of hepatocytes in primary
culture and to compete for binding to a soluble form of the huHGF
receptor.

A. Protein quantification of wild-type huHGF and huHGF variants.

A specific two-site huHGF sandwich ELISA using two monoclonal
antibodies was used to quantify wild-type recombinant huHGF (WT
rhuHGF), single chain and protease substitution variants. Microtiter
plates (Maxisorb, Nunc) were coated with 10 µg/ml of a monoclonal
anti-rhuHGF antibody A 3.1.2 (IgG2a phenotype, affinity: 3.2 x 10
mol) in 50 mM carbonate buffer, pH 9.6, overnight at 4°C. After
blocking plates with 0.5 % BSA (Sigma), 0.01 % thimerosal in PBS, pH
7.4, and subsequent washes, duplicate serial dilutions of HGF samples
were prepared and in parallel a CHO-expressed rhuHGF (40-0.1 ng/mL)
was used as a standard. Fifty microliters of these dilutions were
simultaneously incubated with 50 µL of a 1:1500 diluted horseradish
peroxidase conjugated monoclonal anti-rhuHGF antibody B 4.3 (IgG1
phenotype, affinity: 1.3 x 10 mol) for 2 h at RT. The substrate was
prepared by adding 0.04 % o-phenylenediamine-dihydrochloride (Sigma)
and 0.012 % (v/v) hydrogen peroxide (Sigma) to PBS and 100 µl were
added to the washed plates for 15 minutes at RT. The reaction was
stopped by adding 50 µL of 2.25 M sulfuric acid to each well. The
absorbance at 490 nm, with the absorbance at 405 nm subtracted as
background, was determined on a microtiter plate reader (Vmax,
Molecular Devices, Menlo Park, CA). The data were reduced using a
four-parameter curve-fitting program developed at Genentech, Inc.
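
The Genentech curve-fitting program itself is not described here. As a
hedged illustration of what a four-parameter fit is generally used for,
the sketch below implements one common form of the four-parameter
logistic (4PL) model and its inverse, which converts a measured
absorbance back into a concentration on the standard curve. The model
form, the function names and the parameter values are assumptions made
for this example; the fitting step that would estimate the parameters
from the standards is not shown.

```python
import math


def four_pl(x, a, b, c, d):
    """Four-parameter logistic response at concentration x.

    a: response at zero concentration
    b: slope factor
    c: concentration at the curve midpoint
    d: response at saturating concentration
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives response y (y strictly between a and d)."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Hypothetical parameters, as if obtained from a fit to the standards.
A, B, C, D = 0.05, 1.2, 5.0, 2.0

absorbance = four_pl(10.0, A, B, C, D)            # standard at 10 ng/mL
recovered = inverse_four_pl(absorbance, A, B, C, D)
print(f"A = {absorbance:.3f}, back-calculated = {recovered:.2f} ng/mL")
```

In practice the four parameters would be estimated from the serially
diluted standard, and unknown samples would be read off the fitted
curve with the inverse function.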

An HGF polyclonal sandwich ELISA was used to quantify all kringle
deletion and C-terminal truncation variants. Briefly, microtiter
plates (Nunc) were coated with 5 µg/mL guinea pig polyclonal (anti
CHO-expressed rhuHGF) IgG antibody preparation (Genentech, Inc.) as
described above. This antibody recognizes rhuHGF as well as HGF
truncated forms when compared to visual inspection of Western blots,
making it ideal for monitoring HGF variants. Plates were blocked and
duplicate serial dilutions of 293 cell supernatants (1:103-6.106) were
added and incubated overnight at 4°C. Purified CHO-expressed rhuHGF
(100-0.78 ng/mL) was used as a standard and incubated in parallel.
Plates were washed and incubated with a IrBOO dilution of the same
polyclonal antibody (approx. 400 ng/mL) but in this case horseradish
peroxidase conjugated for detection of the variants (see above).
Western blotting was performed to determine the size of the expressed
HGF variants. For this, SDS-polyacrylamide gel electrophoresis and
Western blotting were performed using standard methods with the
polyclonal IgG antibody preparation (500 ng/mL). A chemiluminescent
detection method (Amersham) and a goat anti-guinea pig IgG-horseradish
peroxidase conjugate (1:5000) were used for development of the blot as
described by the manufacturer.
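
The dilution factor of the standard series is not stated. A constant
two-fold factor is one scheme consistent with the stated 100-0.78 ng/mL
range of the polyclonal ELISA standard, as the following sketch shows.
The function name and the choice of factor are assumptions for
illustration.

```python
import math


def serial_dilution(start, factor, steps):
    """Concentrations of a serial dilution: start, start/factor, ...

    Returns steps + 1 values (the starting concentration plus one value
    per dilution step).
    """
    return [start / factor ** k for k in range(steps + 1)]


# Seven two-fold steps from 100 ng/mL (assumed scheme).
standard_series = serial_dilution(100.0, 2, 7)
print(standard_series)
# -> [100.0, 50.0, 25.0, 12.5, 6.25, 3.125, 1.5625, 0.78125]
```

The same helper reproduces the 25,000-0.064 pM competitor range of the
binding assay if eight five-fold steps are assumed.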
B. Soluble HGF receptor binding assay.

Previous studies on HGF binding to hepatocytes have shown that
huHGF could bind to its cell surface receptor with high affinity
(Kd~24-32 pM; Higuchi and Nakamura, Biochem. Biophys. Res. Comm. 174,
831-838 (1991)). We preferred to examine HGF binding using a soluble
form of the receptor because of the nonspecific binding of HGF to cell
surface heparan sulfate proteoglycans [Naldini et al., EMBO J. 10,
2867-2878 (1991)].

Cell supernatants (concentrated on Amicon filters if
concentration was below 600 ng/mL) were tested for their ability to
block in solution the binding of CHO-expressed 125I rhuHGF (2-5 x 10^3
Ci/mmole, kindly provided by T. Zioncheck, Genentech, Inc.) to the
extracellular domain of the human HGF receptor (huHGFr) fused to the
Fc constant region of a human IgG, expressed and secreted from 293
cells.

1. Construction of huHGFr-IgG chimeras.

A full length cDNA clone encoding the huHGFr was constructed by
joining partial cDNAs isolated from cDNA libraries and from PCR
amplification. Coding sequences for amino acids 1-270 were isolated
from a human placental cDNA library (provided by T. Mason, Genentech)
screened with a 50 mer oligonucleotide
(5'-ATGAAGGCCCCCGCTGTGCTTGCACCTGGCATCCTCGTGCTCCTGTTTACC-3')
(SEQ. ID. NO: 16). Sequences encoding amino acids 809-1390 were
isolated from a human liver library (Stratagene) screened with the
oligonucleotide probe
(5'-CACTAGTTJUKSATGGGGGACATGTCTGTOWSAGGATA CCGT-3')
(SEQ. ID. NO: 17).

Conditions for plating libraries, and for hybridization and
washing filters were as described (Godowski et al., Proc. Natl. Acad.
Sci. USA 86, 8083-8087 (1989)). PCR was used to isolate a cDNA clone
containing residues 271-808 of the HGFr (c-met) from A549 cells. Ten
µg of total RNA was used for reverse transcription using a primer
specific to the HGFr (5'-TAGTACTAGCACTATGATGTCT-3') (SEQ. ID. NO: 18)
in a 100 µl reaction using Moloney murine leukemia virus reverse
transcriptase and buffers supplied by Bethesda Research Laboratories.
One-tenth of this reaction mixture was used for PCR amplification.
The PCR reaction was performed in a volume of 100 µl containing 10 µl
of the reverse transcriptase reaction, 10 mM KCl, 20 mM Tris-HCl (pH
8.8), 10 mM (NH4)2SO4, 6 mM MgSO4, 0.1% Triton X-100, 1 U of Vent DNA
polymerase (New England Biolabs) and 50 pmol each of the forward
primer (5'-TTTACTTCTTGACGGTCCAAAG-3') (SEQ. ID. NO: 19) and the reverse
primer (5'-CAGGGGGAGTTGCaiGATTCAGCTGT-3') (SEQ. ID. NO: 20). After
thirty cycles of denaturation (95°C, 1 min), annealing (55°C, 45 secs)
and extension (72°C, 2 min), the PCR products were recovered from
low-melting temperature agarose gels. The full-length HGFr cDNA was
subcloned into vector pRK7 (see WO 90/02798, published 22 March 1990)
and double-stranded DNA sequencing was performed by the
dideoxynucleotide method.

The coding sequence of the extracellular domain of the huHGFr was
fused to those of the human IgG1 heavy chain in a two-step process.
PCR was used to generate a fragment with a unique BstEII site 3' to
the coding sequences of the HGFr amino acid 929. The 5' primer
(located in the vector upstream of the HGFr coding sequences) and the
3' primer (5'-AGTTTTGTCGGTGACCTGATCATTCTGATCTGGTTGAACTATTAC-3') (SEQ.
ID. NO: 21) were used in a 100 µl reaction as described above except
that the extension time at 72°C was 3 minutes, and 40 ng of the full
length HGFr expression vector was used as template. Following
amplification, the PCR product was joined to the human IgG-γ1 heavy
chain cDNA through a unique BstEII site in that construct [Bennett et
al., J. Biol. Chem. 266, 23060-23067 (1991)]. The resulting construct
contained the coding sequences of amino acids 1-929 of the huHGFr
fused via the BstEII site (adding the coding sequences for amino acids
V and T) to the coding sequences of amino acids 216-443 of the human
IgG-γ1 heavy chain. Sequencing of the construct was carried out as
described above.

WO 93/23541 PCT/US93/04648

2. Binding assay.

The binding assay was performed in breakable microtiter plates
(Nunc) coated o/n at 4°C with 1 µg/mL of rabbit-anti-human IgG Fc
specific antibody (Jackson Immunoresearch) and plates were carefully
washed with PBS containing 0.05% Tween 20 (Biorad). After blocking
with PBS containing 0.1% BSA, in this same buffer, 50 pM of 125I-rhuHGF
in 25 µL per well were added. To each well 50 µL of serial dilutions
(1:25-1:6000) of cell supernatants, purified CHO-expressed rhuHGF
(25,000-0.064 pM) or medium were added in duplicates. Subsequently,
25 µL of 50 pM of HGF receptor:IgG fusion protein were added and the
plates were incubated with gentle shaking. After 4 hours, when
equilibrium was reached, plates were washed and wells were
individually counted in a gamma-counter. The amount of
nonspecifically bound radioactivity was estimated by incubating HGF
receptor:IgG with a 500-fold excess of unlabelled rhuHGF. The
dissociation constant (Kd) of each analogue was calculated at the IC50
from fitted inhibition curves essentially as described (DeBlasi et
al., 1989 [?]) using the huHGF concentration determined by ELISA.
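
The exact calculation used to derive Kd from the IC50 is not spelled
out above. Two relations commonly applied to competition binding data
are sketched below as an illustration only: for homologous competition
(labelled and unlabelled ligand identical) Kd is often approximated as
IC50 minus the tracer concentration, and for a different competitor the
Cheng-Prusoff relation is often used. Both assume negligible ligand
depletion. The function names and the IC50 values are hypothetical; the
tracer concentration follows from the 1:2:1 volume ratio of the three
additions described above.

```python
import math


def final_concentration(stock, added_volume, total_volume):
    """Concentration of a component after dilution into the assay well."""
    return stock * added_volume / total_volume


def kd_homologous(ic50, tracer):
    """Kd estimate for homologous competition: Kd = IC50 - [tracer]."""
    return ic50 - tracer


def ki_cheng_prusoff(ic50, tracer, tracer_kd):
    """Cheng-Prusoff estimate: Ki = IC50 / (1 + [tracer] / Kd_tracer)."""
    return ic50 / (1.0 + tracer / tracer_kd)


# 50 pM tracer added as 1 part of 4 (tracer : competitor : receptor = 1:2:1).
tracer_pm = final_concentration(50.0, 1.0, 1.0 + 2.0 + 1.0)

# Hypothetical IC50 values read from fitted inhibition curves (pM).
kd_wt = kd_homologous(72.5, tracer_pm)
ki_variant = ki_cheng_prusoff(300.0, tracer_pm, kd_wt)
print(f"tracer = {tracer_pm} pM, Kd(WT) = {kd_wt} pM, Ki(variant) = {ki_variant:.1f} pM")
```

Whichever relation is used, the competitor concentrations entering the
inhibition curve are those determined by ELISA, as stated above.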
C. Biological assay. 

The biological activity of WT huHGF and variants was measured by
their abilities to induce DNA synthesis of rat hepatocytes in primary
culture. Hepatocytes were isolated according to published perfusion
techniques with minor modifications [Garrison and Haynes, J. Biol.
Chem. 250, 2769-2777 (1975)]. Briefly, the livers of female Sprague
Dawley rats (160-180 g) were perfused through the portal vein with 100
mL of Ca2+-free Hepes buffered saline containing 0.02% Collagenase
type IV (Sigma). After 20 minutes the liver was removed, placed in
buffer, gently stirred to separate hepatocytes from connective tissue
and blood vessels, and filtered through nylon mesh. Cells were then
washed by centrifugation, resuspended at 1x10^ cells/mL in Williams
Media E (Gibco) containing Penicillin (100 U/ml), Streptomycin (100
µg/mL), L-Glutamine (2 mM), trace elements (0.01%), transferrin (10
µg/mL) and aprotinin (1 µg/mL). Hepatocytes were incubated in 96-well
microtiter plates (Falcon) in the presence of duplicate serial
dilutions of either purified CHO-expressed rhuHGF (1-0.031 µg/mL), 293
supernatants (1:4-1:256) or medium. After 48 hours incubation at
37°C, 0.5 µCi 3H-TdR (15 Ci/mmole, Amersham) was added to each well
and incubated for an additional 16 hours. Cells were harvested on
filter papers, which were washed, dried and counted in a Beckman
counter after addition of scintillation liquid. For each huHGF
variant, the specific activity (SA) expressed in units/mg was
calculated at half-maximal proliferation (defined as 1 unit/mL) using
the HGF concentration obtained in ELISA.
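
The definition above implies a simple conversion: if half-maximal
proliferation corresponds to 1 unit/mL, the specific activity is the
reciprocal of the HGF concentration (in mg/mL) at which half-maximal
proliferation is reached. A minimal sketch follows; the function names
and the example concentrations are hypothetical and serve only to show
the arithmetic.

```python
import math

NG_PER_MG = 1_000_000.0


def specific_activity(half_max_ng_per_ml):
    """Specific activity in units/mg.

    half_max_ng_per_ml is the ELISA-determined HGF concentration giving
    half-maximal proliferation, which is defined as 1 unit/mL.
    """
    return NG_PER_MG / half_max_ng_per_ml


def residual_activity(sa_variant, sa_wild_type):
    """Ratio SA var / SA wt, as tabulated for the variants."""
    return sa_variant / sa_wild_type


# Hypothetical half-maximal concentrations (ng/mL).
sa_wt = specific_activity(30.0)        # about 3.3e4 units/mg
sa_var = specific_activity(150.0)      # about 6.7e3 units/mg
print(f"SA wt = {sa_wt:.3g}, SA var = {sa_var:.3g}, "
      f"ratio = {residual_activity(sa_var, sa_wt):.2f}")
```

A variant that needs five times more protein to reach half-maximal
proliferation therefore has a residual activity of 0.2.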

D. Induction of tyrosine phosphorylation on A549 cells.

Human lung carcinoma cells (A549) monolayers were cultured in
RPMI 1640 medium containing 10% fetal bovine serum and maintained at
37°C in a humidified atmosphere with 5% CO2. Serum-starved cells were
incubated without or with 200 ng/mL rhuHGF for 5 minutes at 37°C and
extracted with lysis buffer containing 50 mM Hepes, 150 mM NaCl, 1.5
mM MgCl2, 1 mM EGTA, 10 % Glycerol, 1 % Triton X-100 and a cocktail
of protease inhibitors. The lysates were immunoprecipitated with
anti-Met COOH antibodies and blotted with anti-phosphotyrosine
antibodies (see Western blotting above).



EXAMPLE 3 

Analysis of Cleavage Site Mutants

The cleavage site of proteases commonly contains a basic residue
at position P1 and two hydrophobic amino acid residues in positions P'1
and P'2, which follow the cleaved peptide bond. The proposed cleavage
site of huHGF (P1 R494, P'1 V495, P'2 V496) fits this consensus. We
chose to try to block cleavage of huHGF by replacing the P1 R494 with
either D, E, or A. The major form of WT rhuHGF expressed in these
cells is cleaved into two-chain material as judged by the presence of
the α-chain with an apparent molecular mass of 69 kDa (Fig. 2). Each
of these mutations appeared to block processing of rhuHGF because
under reducing conditions these variants migrated as a single band at
94 kDa, the predicted size of single-chain HGF. These variants
totally lacked the ability to induce the proliferation of hepatocytes
in primary culture (Fig. 3A). However, when these variants were
analyzed for their ability to compete with WT rhuHGF for binding to
the HGF receptor:IgG fusion protein, their inhibition curves were
roughly similar to that of WT rhuHGF (Fig. 3B). The Kd determined
from these curves showed that WT rhuHGF binds to the fusion protein
with high affinity (50-70 pM) whereas all single chain variants showed
approximately a 2- to 10-fold higher Kd (100-500 pM) compared to WT
rhuHGF. Results from at least three independent assays are summarized
in Table I as residual hepatocyte proliferative activity and receptor
binding capacity compared to WT rhuHGF.
Our binding studies showed that WT rhuHGF bound to the soluble
receptor fusion protein with a single class of high affinity binding
sites (50-70 pM), similar to those found on hepatocytes by Higuchi and
Nakamura (1991). However, binding of HGF on cells may be slightly
different since the soluble receptor is actually a dimer held together
by the disulfide bridge of the hinge in the Fc portion of the IgG.

Direct comparison of specific activity (SA) versus Kd ratios of
all single chain variants showed they were inactive at the highest
concentration tested (SA < 3%) while receptor binding affinities were
only decreased by a factor of 2-3.

These results argue strongly that cleavage of HGF into the
two-chain form is required for mitogenic activity, i.e. that
single-chain HGF is a promitogen and that the uncleaved form of HGF
binds to the HGF receptor, albeit with a reduced affinity.

The major form of HGF isolated from placenta [Hernandez et al.,
(1992) J. Cell Physiol., in press] or expressed in transfected COS
cells [Rubin et al., Proc. Natl. Acad. Sci. USA 88, 415-419 (1991)] is
in single-chain form. When tested in mitogenic assays, this
single-chain form of HGF is found to be biologically active. Taken
together with our data, this suggests that this single-chain HGF is
activated to the two-chain form during the mitogenic assay.

A second observation is that single-chain variants retain
substantial capacity to bind to the HGF receptor, as suggested by our
competition binding assays. This raises the interesting possibility
that single-chain HGF may be bound to cell-surface HGF receptor in
vivo in an inactive state and can subsequently be cleaved to the
active double-chain form by the appropriate protease.

EXAMPLE 4

The Effects of Protease Domain Mutations

To elucidate the functional importance of the protease domain of
HGF, several single, double, and triple mutations were made in order to
reconstitute a potential serine-protease active site. The
construction of these variants is described in Example 1.
We replaced HGF residues Q534 with H, Y673 with S, or V692 with S
as either single, double or triple mutations. The analysis of their
effects on mitogenic activity and receptor binding showed that the
single mutation Q534H did not significantly alter either SA (5.2 x 10^4
Units/mg) or Kd (60 pM) when compared to wt rhuHGF (respectively 3.3 x
10^4 Units/mg and 70 pM) whereas Y673S and V692S exhibited SA reduced
approximately 5- and 10-fold, respectively. In fact, these two
variants never reached the maximum plateau seen with WT rhuHGF
(approximately 50 % of wt rhuHGF plateau). Interestingly, these
variants showed a Kd similar to WT rhuHGF. All other double and
triple variants also retained the ability to bind the HGF receptor but
they clearly showed a reduced SA (Table I). The residual SA of the
double variants Q534H,Y673S and Y673S,V692S and of the triple variant
Q534H,Y673S,V692S was less than 3 % compared to WT rhuHGF. However,
the Kd of these variants was not significantly different from WT
rhuHGF (Table I). These variants indicate that mutations within the
β-chain of HGF block mitogenic activity but they are still able to
bind to the HGF receptor. Thus, it appears that these mutants are
defective in an activity subsequent to receptor binding.

These results show that although the β-chain is not required for
receptor binding, certain residues (e.g. Y673 and V692) are critical
for the structure and/or activity of HGF. Substitution of the
nonpolar residue V692 with the polar residue S might have caused a
structural transition if new hydrogen bonds to the active site residue
D594, as found in serine-proteases, have been introduced.
Substitution of Y673 with the smaller residue S might also introduce
some local structural modifications. On the other hand, replacement
of the polar residue Q534 by another polar residue H of similar size
would not likely cause a drastic difference in the HGF conformation as
this residue should be exposed; indeed the Q534H variant was similar
to rhuHGF (Table I).

EXAMPLE 5 

The Effect of C-terminal and Kringle Deletions

In order to ascertain whether the α-chain is required for HGF
binding or activity, C-terminal truncations were made as described in
Example 1, resulting in variants containing either the α-chain alone,
or variants truncated after the third (N-384) or second (N-303)
Kringles.

A number of C-terminal truncations of HGF were made by deleting
either the β-chain or the β-chain in addition to a progressive number
of kringles as depicted in Fig. 1. One variant (N-207) corresponding
to the N-terminal domain with the first Kringle did not express the
protein to levels detectable either by Western blotting or ELISA using
a polyclonal antibody preparation and thus was not investigated
further. Expression of the variants containing the first two Kringles
(N-303), three Kringles (N-384) or the complete α-chain of HGF was as
low as 250- &00 ng/mL. A summary of the residual SA and Kd compared to
WT rhuHGF of these variants is presented in Table I. At the
concentration tested no activity above background levels was observed,
indicating that these variants lost their biological activity.
However, binding competition showed that variants N-303, N-384 or the
α-chain still retained substantial binding capacity (up to 23 %
compared to WT rhuHGF binding). Thus, the N-terminal 272 residues of
HGF (the mature form of variant N-303) are sufficient for high
affinity binding to the HGF receptor.

Results from deleting each kringle domain are shown in Table I.
Deletion of the first Kringle (variant ΔK1) of HGF affected biological
activity most, showing at least a 100-fold reduction (SA < 0.2% of wt
rhuHGF). Similarly, binding of this variant was also affected as it
failed to compete for binding with wt rhuHGF up to 2 µg/mL. Deletion
of all other Kringles (variants ΔK2, ΔK3 or ΔK4) also induces severely
reduced mitogenic activity (Table I). However, the Kds of these
deletion variants remained close to that observed with wt rhuHGF.

These data show that Kringles K3 and K4 are not required for
receptor binding. Our data support the previous observations by
Miyazawa et al., 1991 supra and Chan et al., 1991 supra, in the sense
that variant N-303, which in amino acid sequence is very similar to
HGF/NK2, retains the ability to compete efficiently for binding to
the HGF receptor (Kd~280 pM). Furthermore, the observations that N-303
is sufficient to bind to the receptor and that the second Kringle is
not required for binding the HGF receptor (in the context of the
remainder of the molecule) suggest that the receptor binding domain is
contained within the finger and first Kringle of huHGF.
Unfortunately, we have not been able to detect expression of this
variant using our polyclonal antisera, suggesting that variant N-207
(deletion after the first kringle) was not expressed in 293 cells.
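
As a rough consistency check, the Kd ratio tabulated for a variant can
be converted back into an absolute Kd by dividing the wild-type Kd by
the ratio Kd wt/Kd var. The sketch below applies this to variant N-303
using the ratio given in Table 1 (0.23) and the 50-70 pM range reported
for WT rhuHGF. The function name is hypothetical, and the result is
only an estimate because the ratios are averages over several assays.

```python
import math


def variant_kd(kd_wild_type_pm, ratio_wt_over_var):
    """Absolute Kd of a variant from the tabulated ratio Kd wt / Kd var."""
    return kd_wild_type_pm / ratio_wt_over_var


N303_RATIO = 0.23                 # Kd wt / Kd var for N-303 (Table 1)
WT_KD_RANGE_PM = (50.0, 70.0)     # Kd range reported for WT rhuHGF

low, high = (variant_kd(kd, N303_RATIO) for kd in WT_KD_RANGE_PM)
print(f"Estimated Kd of N-303: {low:.0f}-{high:.0f} pM")
```

This places the estimated Kd of N-303 in the range of a few hundred
picomolar, that is, roughly four-fold weaker binding than WT rhuHGF.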

EXAMPLE 6 

Induction of Tyrosine-Phosphorylation of the huHGF Receptor

We determined if variants R494E or Y673S,V692S, which bind the
HGF receptor in vitro but are defective for mitogenic activity, could
stimulate tyrosine-phosphorylation of the HGF receptor in A549 cells.
Serum starved cells were treated with purified WT rhuHGF or variants
and immunoprecipitates of the HGF receptor were blotted and probed
with phosphotyrosine antibodies. Stimulation with wt rhuHGF led to the
phosphorylation on tyrosine of the 145 kDa β-subunit of the HGF
receptor (Fig. 4). Both variants exhibited a reduced ability to
induce phosphorylation of the HGF receptor.

Stimulation of tyrosine phosphorylation on the HGF receptor
β-subunit by HGF was previously reported [Bottaro et al., Science 251,
802-804 (1991); Naldini et al., 1991 supra]. The present data show
that variants R494E and Y673S,V692S can bind the soluble HGF
receptor:IgG protein in vitro but are not efficient in stimulating
tyrosine-phosphorylation in A549 cells. One interpretation of this
result is that these variants are capable of binding the HGF receptor
on A549 cells, but are defective in a function required to induce
efficient phosphorylation, e.g. receptor dimerization. It has been
shown for other receptor proteins with an intrinsic tyrosine kinase,
such as the epidermal and platelet-derived growth factor receptors,
that receptor-receptor interactions or dimerization is required for
activation of kinase function [see for review Ullrich and
Schlessinger, Cell 61, 203-212 (1990)]. Alternatively, these variants
may not be able to bind the cell-surface associated HGF receptor.

The unique structure of HGF suggests that there may be multiple
events that regulate the biological activity of this molecule. An
early stage of regulation may be the cleavage step to generate the
biologically active two-chain form. Interestingly, cleavage may not
simply regulate receptor binding but rather control a subsequent event
required for activating the HGF receptor. Our data also suggest that
the β-chain, while not absolutely required for receptor binding,
contributes to a receptor activation step. These variants may be
useful in dissecting the signalling events at the HGF receptor.
EXAMPLE 7

Hairpin Domain and Kringle 1 Domain Variants
The huHGF variants listed in Tables 2 and 3 were generated, and
their specific activities (SA) and Kd ratios were determined
essentially as described in the foregoing examples.



Although the foregoing refers to particular preferred
embodiments, it will be understood that the present invention is not
so limited. It will occur to those ordinarily skilled in the art that
various modifications may be made to the disclosed embodiments without
diverting from the overall concept of the invention. All such
modifications are intended to be within the scope of the present
invention.
Table 1



Variants (var)            SA var/SA wt             Kd wt/Kd var
                          (mean +/- S.D.)          (mean +/- S.D.)

Single-chain
  R494A                   <0.03                    0.32 +/- 0.18
  R494D                   <0.03                    0.51 +/- 0.21
  R494E                   [illegible]              0.31 +/- 0.13

Protease domain
  Q534H                   1.19 +/- 0.44            1.48 +/- 0.85
  Y673S                   0.27 +/- 0.07*           1.35 +/- 0.72
  V692S                   [illegible] +/- 0.04     1.02 +/- 0.13
  Q534H,Y673S             [illegible]              2.24 +/- 1.11
  Y673S,V692S             <0.02                    1.76 +/- [illegible]
  Q534H,Y673S,V692S       <0.02                    1.91 +/- [illegible]

C-terminal truncation
  N-303                   <0.05                    0.23 +/- 0.03
  N-384                   <0.05                    0.25 +/- 0.02
  α-chain                 <0.04                    0.25 +/- 0.03

Kringle deletion
  ΔK1                     <0.002                   <0.03
  ΔK2                     <0.05                    0.41 +/- 0.18
  ΔK3                     <0.03                    0.56 +/- 0.36
  ΔK4                     <0.07                    0.86 +/- 0.46

* means that the mitogenic activity of the variant did not reach the
same absolute level as wild-type huHGF.
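
The variant names in Table 1 use the common substitution shorthand: the one-letter code of the wild-type residue, its position in the huHGF sequence, and the one-letter code of the replacement, with commas separating the substitutions of a multiple mutant. As a minimal sketch only (the names `Substitution` and `parse_variant` are illustrative assumptions and not part of the disclosure), such names could be split into their components in Python as follows. Truncation and deletion variants such as N-303 or ΔK1 are deliberately rejected, since they do not follow this pattern.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    """One point substitution, e.g. R494E = Arg at 494 replaced by Glu."""
    wild_type: str    # one-letter code of the wild-type residue
    position: int     # residue number in the wild-type sequence
    replacement: str  # one-letter code of the substituted residue


# Hypothetical pattern: letter, digits, letter (e.g. "Y673S").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(name: str) -> list[Substitution]:
    """Split a variant name such as 'Y673S,V692S' into substitutions."""
    substitutions = []
    for token in name.split(","):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a point substitution: {token!r}")
        wild_type, position, replacement = match.groups()
        substitutions.append(
            Substitution(wild_type, int(position), replacement)
        )
    return substitutions


if __name__ == "__main__":
    for variant in ("R494E", "Y673S,V692S", "Q534H, Y673S, V692S"):
        print(variant, "->", parse_variant(variant))
```

Keeping each substitution as a small tuple makes it straightforward to group the table rows by position, for example to collect all variants altered at position 494.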
SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT: Genentech, Inc., Godowski, Paul J., Lokker, Natalie A., Mark, Melanie R.

(ii) TITLE OF INVENTION: HEPATOCYTE GROWTH FACTOR VARIANTS

(iii) NUMBER OF SEQUENCES: 21

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Genentech, Inc.
(B) STREET: 460 Point San Bruno Blvd
(C) CITY: South San Francisco
(D) STATE: California
(E) COUNTRY: USA
(F) ZIP: 94080

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 5.25 inch, 360 Kb floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: patin (Genentech)

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 07/884811
(B) FILING DATE: 18-MAY-92

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 07/885971
(B) FILING DATE: 18-MAY-92

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Dreger, Ginger R.
(B) REGISTRATION NUMBER: 33,055
(C) REFERENCE/DOCKET NUMBER: 755,779P1

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 415/225-3216
(B) TELEFAX: 415/952-9881
(C) TELEX: 910/371-7168

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4732 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



TTCGAGCTCG CCCGACATTG ATTATTGACT AGTTATTAAT AGTAATCAAT 50
TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC 100
TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 150
ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA 200
TTGACGTCAA TGGGTGGAGT ATTTACGGTA AACTGCCCAC TTGGCAGTAC 250
ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT CAATGACGGT 300
AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC 350
TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC 400
GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 450
TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA 500
AAATCAACGG GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC 550
AAATGGGCGG TAGGCGTGTA CGGTGGGAGG TCTATATAAG CAGAGCTCGT 600
TTAGTGAACC GTCAGATCGC CTGGAGACGC CATCCACGCT GTTTTGACCT 650
CCATAGAAGA CACCGGGACC GATCCAGCCT CCGCGGCCGG GAACGGTGCA 700
TTGGAACGCG GATTCCCCGT GCCAAGAGTG ACGTAAGTAC CGCCTATAGA 750
GTCTATAGGC CCACCCCCTT GGCTTCGTTA GAACGCGGCT ACAATTAATA 800
CATAACCTTA TGTATCATAC ACATACGATT TAGGTGACAC TATAGAATAA 850
CATCCACTTT GCCTTTCTCT CCACAGGTGT CCACTCCCAG GTCCAACTGC 900
ACCTCGGTTC TATCGATTGA ATTCCCCGGG GATCCTCTAG AGTCGACCTG 950
CAGAAGCTTG CCTCGAGGCA AGCTTGGCCG CCATGGCCCA ACTTGTTTAT 1000
TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACA AATTTCACAA 1050
ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATC 1100
AATGTATCTT ATCATGTCTG GATCGATCGG GAATTAATTC GGCGCAGCAC 1150
CATGGCCTGA AATAACCTCT GAAAGAGGAA CTTGGTTAGG TACCTTCTGA 1200
GGCGGAAAGA ACCAGCTGTG GAATGTGTGT CAGTTAGGGT GTGGAAAGTC 1250
CCCAGGCTCC CCAGCAGGCA GAAGTATGCA AAGCATGCAT CTCAATTAGT 1300
CAGCAACCAG GTGTGGAAAG TCCCCAGGCT CCCCAGCAGG CAGAAGTATG 1350
CAAAGCATGC ATCTCAATTA GTCAGCAACC ATAGTCCCGC CCCTAACTCC 1400
GCCCATCCCG CCCCTAACTC CGCCCAGTTC CGCCCATTCT CCGCCCCATG 1450
GCTGACTAAT TTTTCTTATT TATGCAGAGG CCGAGGCCGC CTCGGCCTCT 1500
GAGCTATTCC AGAAGTAGTG AGGAGGCTTT TTTGGAGGCC TAGGCTTTTG 1550
CAAAAAGCTG TTAACAGCTT GGCACTGGCC GTCGTTTTAC AACGTCGTGA 1600
GTGGGAAAAC CCTGGCGTTA CCCAACTTAA TCGCCTTGCA GCACATCCCC 1650
CCTTCGCCAG CTGGCGTAAT AGCGAAGAGG CCCGCACCGA TCGCCCTTCC 1700
CAACAGTTGC GTAGCCTGAA TGGCGAATGG CGCCTCATGC GGTATTTTCT 1750
CCTTACGCAT CTGTGCGGTA TTTCACACCG CATACGTCAA AGCAACCATA 1800
GTACGCGCCC TGTAGCGGCG CATTAAGCGC GGCGGGTGTG GTGGTTACGC 1850
GCAGCGTGAC CGCTACACTT GCCAGCGCCC TAGCGCCCGC TCCTTTCGCT 1900
TTCTTCCCTT CCTTTCTCGC CACGTTCGCC GGCTTTCCCC GTCAAGCTCT 1950
AAATCGGGGG CTCCCTTTAG GGTTCCGATT TAGTGCTTTA CGGCACCTCG 2000
ACCCCAAAAA ACTTGATTTG GGTGATGGTT CACGTAGTGG GCCATCGCCC 2050
TGATAGACGG TTTTTCGCCC TTTGACGTTG GAGTCCACGT TCTTTAATAG 2100
TGGACTCTTG TTCCAAACTG GAACAACACT CAACCCTATC TCGGGCTATT 2150
CTTTTGATTT ATAAGGGATT TTGCCGATTT CGGCCTATTG GTTAAAAAAT 2200
GAGCTGATTT AACAAAAATT TAACGCGAAT TTTAACAAAA TATTAACGTT 2250
TACAATTTTA TGGTGCACTC TCAGTACAAT CTGCTCTGAT GCCGCATAGT 2300
TAAGCCAACT CCGCTATCGC TACGTGACTG GGTCATGGCT GCGCCCCGAC 2350
ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT GCTCCCGGCA 2400
TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2450
GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGTATTCT TGAAGACGAA 2500
AGGGCCTCGT GATACGCCTA TTTTTATAGG TTAATGTCAT GATAATAATG 2550
GTTTCTTAGA CGTCAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC 2600
TATTTGTTTA TTTTTCTAAA TACATTCAAA TATGTATCCG CTCATGAGAC 2650
AATAACCCTG ATAAATGCTT CAATAATATT GAAAAAGGAA GAGTATGAGT 2700
ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG CATTTTGCCT 2750
TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG 2800
ATCAGTTGGG TGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT 2850
AAGATCCTTG AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC 2900
TTTTAAAGTT CTGCTATGTG GCGCGGTATT ATCCGGTGAT GACGCCGGGC 2950
AAGAGCAACT CGGTCGCCGC ATACACTATT CTCAGAATGA CTTGGTTGAG 3000
TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA CAGTAAGAGA 3050
ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC 3100
TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC 3150
ATGGGGGATC ATGTAACTCG CCTTGATCGT TGGGAACCGG AGCTGAATGA 3200
AGCCATACCA AACGACGAGC GTGACACCAC GATGCCAGCA GCAATGGCAA 3250
CAACGTTGCG CAAACTATTA ACTGGCGAAC TACTTACTCT AGCTTCCCGG 3300
CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG GACCACTTCT 3350
GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG 3400
GTGAGCGTGG GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG 3450
CCCTCCCGTA TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA 3500
TGAACGAAAT AGACAGATCG CTGAGATAGG TGCCTCACTG ATTAAGCATT 3550
GGTAACTGTC AGACCAAGTT TACTCATATA TACTTTAGAT TGATTTAAAA 3600
CTTGATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT 3650
CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC 3700
CCGTAGAAAA GATCAAAGGA TCTTCTTGAG ATCCTTTTTT TCTGCGCGTA 3750
ATCTGCTGCT TGCAAACAAA AAAACCACCG CTACCAGCGG TGGTTTGTTT 3800
GCCGGATCAA GAGCTACCAA CTCTTTTTCC GAAGGTAACT GGCTTCAGCA 3850
GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA GTTAGGCCAC 3900
CACTTCAAGA ACTCTGTAGC ACCGCCTACA TACCTCGCTC TGCTAATCCT 3950
GTTACCAGTG GCTGCTGCCA GTGGCGATAA GTCGTGTCTT ACCGGGTTGG 4000
ACTCAAGACG ATAGTTACCG GATAAGGCGC AGCGGTCGGG CTGAACGGGG 4050
GGTTCGTGCA CACAGCCCAG CTTGGAGCGA ACGACCTACA CCGAACTGAG 4100
ATACCTACAG CGTGAGCATT GAGAAAGCGC CACGCTTCCC GAAGGGAGAA 4150
AGGCGGACAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG AGAGCGCACG 4200
AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT CTTTATAGTC CTGTCGGGTT 4250
TCGCCACCTC TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC 4300
GGAGCCTATG GAAAAACGCC AGCAACGCGG CCTTTTTACG GTTCCTGGCC 4350
TTTTGCTGGC CTTTTGCTCA CATGTTCTTT CCTGCGTTAT CCCCTGATTC 4400
TGTGGATAAC CGTATTACCG CCTTTGAGTG AGCTGATACC GCTCGCCGCA 4450
GCCGAACGAC CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC GGAAGAGCGC 4500
CCAATACGCA AACCGCCTCT CCCCGCGCGT TGGCCGATTC ATTAATCCAG 4550
CTGGCACGAC AGGTTTCCCG ACTGGAAAGC GGGCAGTGAG CGCAACGCAA 4600
TTAATGTGAG TTACCTCACT CATTAGGCAC CCCAGGCTTT ACACTTTATG 4650
CTTCCGGCTC GTATGTTGTG TGGAATTGTG AGCGGATAAC AATTTCACAC 4700
AGGAAACAGC TATGACCATG ATTACGAATT AA 4732



PCT/US93/04648



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TTGGAATCCC ATTTACAACC TCGACTTGTT TCGTTTTGGC ACAAGAT 47

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAATCCCATT TACGACGTCC AATTGTTTCG 30

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

CCCATTTACA ACTGCCAATT GTTTCG 26

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

AGAAGGGAAA CAGTGTCGTG CA 22

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AGTGGGCCAC CAGAATCCCC CT 22

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TCCACGACCA GGAGAAATGA CAC 23

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GCATTCAACT TCTGAGTTTC TAATGTAGTC 30

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CATAGTATTG TCAGCTTCAA CTTCTGAACA 30

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

TCCATGTGAC ATATCTTCAG TTGTTTCCAA 30

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

TGTGGTATCA CCTTCATCTT GTCCATGTGA 30

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

ACCTTGGATG CATTAAGTTG TTTC 24

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

TTGTCCATGT GATTAATCAC AGT 23

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

GTTCGTGTTG GGATCCCATT TACCTATCGC AATTG 35

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10596 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:



TTCGAGCTCG CCCGACATTG ATTATTGACT AGTTATTAAT AGTAATCAAT 50 
45 TACGGGGTCA TrAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC 100 



55 



TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 150 



ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA 200 



TTGACGTCAA TGGGTGGAGT ATPTACGGTA AACTGCCCAC TTGGCAGTAC 250 
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ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT CAATGACGGT 300 



AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGAGTTTCC 350 

•5 

TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC 400 
10 GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 450 

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA 500 

15 

AAATCAACGG GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC 550 

AAATGGGCGG TAGGCGTGTA CGGTGGGAGG TCTATATAAG CAGAGCTCGT 600 
20 * 

TTAGTGAACC GTCAGATCGG CTGGAGACGC CATCCACGCT GTTTTGACCT 650 
25 CCATAGAA6A CACCGG6ACC GATCCA6CCT CCGCGGCCGG GAACGGTGCA 700 

TTGGAACGCG GATTCCCCGT GCCAAGAGTG ACGTAAGTAC CGCCTATAGA 750 

30 , 

GTCTATAGGC CCACCCCCTT GGCTTCGTTA GAACGCGGCT ACAATTAATA 800 
CATAACCTTA TGTATCA13VC ACATACGATT TAGGTGACAC TATAGAATAA 850 

35 

CATCCAGTTT GCCTTTCTCT CCACAGGTGT CCACTCCCAG GTCCAACTGC 900 
40 ACCrCGGTTC TATCGATTCT CGAGAATTAA TTCAAGCTTG CGGCCGCAGC 950 

TTGGCCGCCA TGGCCCAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT 1000 

45 

AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT 1050 
TCTACTTGIG: GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGGAT 1100 

50 

CGATCGGGAA TTAATTCGGC GCAGCACCAT GGCCTGAftAT AACCTCTGAA 1150 



55 AGAG6AACTT GGTTAGGTAC CTTCTGAGGC GGAAAGAACC AGCTGTGGAA 1200 
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TGTGTGTCAG TTAGGGTGTG GAAAGTCCCC AGGCTCCCCA GCAGGCAGAA 1250 



GTATGCAAAG CATGCATCTC AATTAGTCAG CAACCAGGTG TGGAAAGTCC 1300 

5 

CCAGGCTCCC CAGCAGGCAG AAGTATGCAA AGCATGCATC TCAATTAGTC 1350 



AGCAACCATA GTCCCGCCCC TAACTCCGCC CATCCCGCCC CTAACTCCGC 1400

CCAGTTCCGC CCATTCTCCG CCCCATGGCT GACTAATTTT TTTTATTTAT 1450 

15 

GCAGAGGCCG AGGCCGCCTC GGCCTCTGAG CTATTCCAGA AGTAGTGAGG 1500 
AGGCTTTTTT GGAGGCCTAG GCTTTTGCAA AAAGCTGTTC ACGTGATGAA 1550 

20 

TTCTCATGTT TGACAGCTTA TCATCGATAG ATCCTCACAG GCCGCACCCA 1600 



25 GCTTTTCTTC CGTTGCCCCA GTAGCATCTC TGTCTGGTGA CCTTGAAGAG 1650 

GAAGAGGAGG GGTCCCGAGA ATCCCCATCC CTACCGTCCA GCAAAAAGGG 1700 

30 

GGACGAGGAA TTTGAGGCCT GGCTTGAGGC TCAGGAC6CA AATCTTGAGG 1750 
ATGTTCAGCG GGAGTTTTCC GGGCTGCGAG TAATTGGTGA TGAGGACGAG 1800 

35 

GATGGTTCGG AGGATGGGGA ATTTTCAGAC CTGGATCTGT CTGACAGCGA 1850 
40 CCATGAAGGG GATGAGGGTG GGGGGGCTGT TGGAGGGGGC AGGAGTCTGC 1900 

ACTCCCTGTA TTCACTGAGC GTCGTCTAAT AAAGATGTCT ATTGATCTCT 1950 

45 

TTTAGTGTGA. ATCATGTCTG ACGAGGGGCC AGGTACAGGA CCTGGAAATG 2000 
GCCTAGGAGA GAAGGGAGAC ACATCTGGAC CAGAAGGCTC CGGCGGCAGT 2050 

50 

GGACCTCAAA GAAGAGGGGG TGATAACCAT GGACGAGGAC GGGGAAGAGG 2100 



55 ACGAGGACGA GGAGGCGGAA GACCAGGAGC CCCGGGCGGC TCAGGATCAG 2150 
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GGCCAAGACA TAGAGATGGT GTCCGGAGAC CCCAAAAACG TCCAAGTTGC 2200 



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ATTGGCTGCA AAGGGACCCA CGGTGGAACA GGAGCAGGAG CAGGAGCGGG 2250 



AGGGGCAGGA GCAGGAGGG6 CAGGAGCAG6 AGGAGGGGCA GGAGCAGGAG 2300 
10 GAGGGGCAGG AGGGGCAGGA GGGGCAGGAG GGGCAGGAGC AGGAGGAGGG 2350 



GCAGGAGCAG GAGGAGGGGC AGGAGGGGCA GGAGGGGCAG GAGCAGGAGG 2400 
AGGGGCAGGA GGAGGAGGAG GGGCAGGAGG GGCAGGAGCA GGAG6AGGGG 2450 
GAGGAGGGGC AGGAGGGGCA GGAGCAGGAG GAGGGGCAGG AGCAGGAGGA 2500 
GGGGCAGGAG GGGCAGGAGC AGGAGGAGGG GCAGGAGGGG GAGGAGGGGC 2550 
AGGAGCAGGA GGAGGGGCAG GAGCAGGAGG GGCAGGAGGG GCAGGAGGGG 2600



CAGGA6CAGG AGGGGCAGGA GGAGGAGGAG GGGCAGGAGG GGCAGGAGGG 2650 
GCAGGAGCAG GAGGGGCAGG AGCAGGAGGG GCAGGAGCAG GAGGGGCAGG 2700 
AGCAGGAGGG GCAGGAGGGG CAGGAGCAGG AGGGGCAGGA GGGGCAGGAG 2750 
GAGGAGGGGC AGGAGGGGCA GGAGCAGGAG GAGGGGCAGG AGGGGCAGGA 2800 
40 GGAGGAGGAG GGGCAGGAGG GGCAGGAGCA GGAGGGGCAG GAGGGGCAGG 2850 



AGCAGGAGGG GCAGGAGGGG CAGGAGCAGG AGGGGCAGGA GGGGCAGGAG 2900 



CAGGAGGAGG GGCAGGAGCA GGAGGGGCAG GAGCAGGAGG TGGAGGCCGG 2950 



GGTCGAGGAG GCAGTGGAGG CCGGGGTCGA GGAGGTAGTG GAGGCCGGGG 3000 



TCGAGGAGGT AGTGGAGGCC GCCGGGGTAG AGGACGTGAA AGAGCCAGGG 3050 
55 GGGGAAGTCG TGAAAGAGCC AGGGGGAGAG 6TCGTGGACG TGGAGAAAAG 3100 
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AGGCCCAGGA GTCCCAGTAG TCAGTCATCA TCATCCGGGT CTCCACCGCG 3150 



5 



CAGGCCGCCT CCAGGTAGAA GGCCATTTTT CCACCCTGTA GGGGAAGCCG 3200 



ATTATTTTGA ATACCACCAA GAAGGTGGCC CAGATGGTGA GCCTGACGTG 3250 



10 CCCCCGGGAG CGATAGAGCA GGGCCCCGCA GATGACCCAG GAGAAGGCCC 3300 



15 
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AAGCACTGGA CCCCGGGGTC AGGGTGATGG AGGCAGGCGC AAAAAAGGAG 3350 



GGTGGTTTGG AAAGCATCGT GGTCAAGGAG GTTCCAACCC GAAATTTGAG 3400 



AACATTGCAG AAGGTTTAAG AGCTCTCCTG GCTAGGAGTC ACGTAGAAAG 3450 



GACTACCGAC GAAGGAACTT GGGTCGCCGG TGTGTTCGTA TATGGAGGTA 3500 
25 GTAAGACCTC CCTTTACAAC CTAAGGCGAG GAACTGCCCT TGCTATTCCA 3550 



CAAT^STCGTC TTACACCATT GAGTCGTCTC CCCTTTGGAA TGGCCCCTGG 3600 



ACCCGGCCCA CAACCTGGCC CGCTAAGGGA GTCCATTGTC TGTTATTTCA 3650 



TGGTCTTTTT ACAAACTCAT ATATTTGCTG AGGTnTGAA GGATGCGATT 3700 



AAGGACCTTG TTATGACAAA GCCCGCTCCT ACCTGCAATA TCAGGGTGAC 3750 
40 TGTGTGCAGC TTTGACGATG QAGTAGATTT GCCTCCCTGG TTTCCACCTA 3800 



TGGTGGAAGG GGCTGCCGCG GAGGGTGATG ACGGAGATGA CGGAGATGAA 3850



GGAGGTGATG GAGATGAGGG TGAGGAAGGG CAGGAGTGAT GTAACTTGTT 3900 



AG6AGACGCC CTCAATCGTA TTAAAAGCCG TGTATTCCCC CGCACTAAAG 3950 



AATAAATCCC CAGTAGACAT CATGCGTGCT GTTGGTGTAT TTCTGGCCAT 4000 
55 CTGTCrTGTC ACCATTTTCG TCCTCCCAAC ATGGGGCAAT TGGGCATACC 4050 
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CATGTTGTCA CGTCACTCAG CTCCGCGCTC AACACCTTCT CGCGTTGGAA 4100 



AACATTAGCG ACATTTACCT GGTGAGCAAT CAGACATGCG ACGGCTTTAG 4150 

5 

CCTGGCCTCC TTAAATTCAC CTAAGAATGG GAGCAACCAG CATGCAGGAA 4200 
10 AAGGACAAGC A6CGAAAATT CACGCCCCCT TGGGAGGTGG CGGCATATGC 4250 

AAAGGATAGC ACTCCCACTC TACTACTGGG TATCATA3X3C TGACTGTATA 4300 

15 

T6CATGAGGA TA6CATATOC TACCCGGA'^ CAGATTAGGA TAGCATATAC 4350 
TACCCAGATA TAGATTAGGA TAGCATATGC TACCCAGATA TAGATTAGGA 4400 

20 

TAGCC3ATGC TACCCAGATA TAAATTAGGA TAGCATATAC TACCCAGATA 4450 



25 TAGATTAGGA TAGCATATGC TACCCAGATA TAGATTAGGA TAGCCTATGC 4500 

TACCCAGATA TAGATTAGGA TAGCATATGC TACCCAGATA TAGATTAGGA 4550 

30 

TAGCATATGC TATCCAGATA TTTGGGTAGT ATATGCTACC CAGATATAAA 4600 
TTAGGATAGC ATATACTACC CTAATCTCTA TTAGGATAGC ATATGCTACC 4650 

35 

CGGATACAGA TTAGGATAGC ATATACTACC CAGATATAGA TTAGGATAGC 4700 
40 ATATGCTACC CAGATATAGA TTAGGATAGC CTATGCTACC CAGATATAAA 4750 

TTAGGATAGC ATATACTACC CAGATATAGA TTAGGATAGC ATATGCTACC 4800 

45 

CAGATATAGA TTAGGATAGC CTATGCTACC CAGATATAGA TTAGGATAGC 4850 

■ 

ATATGCTATC CAGATATTTG 6GTAGTATAT GCTACCCATG 6CAACATTA6 4900 

50 

CCCACCGTGC TCTCAGCGAC CTCGTGAATA TGAGGACCAA CAACCCTGTG 4950 



55 CTTGGCGCTC AGGCGCAAGT GTGTGTAATT TGTCCTCCAG ATCGCAGCAA 5000 
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TCGCGCCCCT ATCTTGGCCC GCCCACCTAC TTATGCAGGT ATTCCCCGGG 5050 



5 



GTGCCATTAG TGGTTTTGTG GGCAAGTGGT TTGACCGCAG TGGTTAGCGG 5100 



GGTTACAATC AGCCAAGTTA TTACACCCTT ATTTTACAGT CCAAAACCGC 5150 



10 AGGGCGGCGT GTGGGGGCTG ACGCGTGCCC CCACTCCACA ATTTCAAAAA 5200 
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AAAGAGTGGC CACTTGTCTT TGTTTATGGG CCCCATrGGC GTGGAGCCCC 5250 



GTTTAATTTT CGGGGGTGTT AGAGACAACC AGTGGAGTCC GCTGCTGTCG 5300



GCGTCCACTC TCTTTCCCCT TGTTACAAAT AGAGTGTAAC AACATGGTTC 5350 



ACCTGTCTTG GTCCCTGCCT GGGACACATC TTAATAACCC CAGTATCATA 5400 



25 TTGCACTAGG ATTATGTGTT GCCCATAGCC ATAAATTCGT GTGAGATGGA 5450 



CATCCAGTCT TTACGGCTTG TCCCCACCCC ATGOATITCT ATTGTTAAAG 5500 



ATA1TCAGAA TGTTTCATTC CTACACTAGT ATTTATTGCC CAAGGGGTTT 5550 



GTGAGGGTTA TATTGGTGTC ATAGCACAAT GCCACCACTG AACCCCCCGT 5600 
CCAAATTTTA TTCTGGGGGC GTCACCTGAA ACCTTGTTTT CGAGCACCTC 5650
ACATACACCT TACTGTTCAC AACTCAGCAG TTATTCTATT AGCTAAACGA 5700



AGGAGA^LTGA AGAAGCAGGC GAAGATTCAG GAGAGTTCAC TGCCCGCTCC 5750 



TTGATCTTCA GCCACTGCCC TTGTGACTAA AATGGTTCAC TACCCTCGTG 5800 



QAATCCTGAC CCCATGTAM TAAAACC6TG ACAGCTCATG GGGTGGGAGA 5850 



TATCGCTGTT CCTTAGGACC CnTTACTAA CCCTAATTCG ATAGCATATG 5900 



55 CTTCCCGTTG GGTAACATAT GCTATTGAAT TAGGGTTAGT CTG6ATAGTA 5950 
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TAtACTACTA CCCGGGAAGC ATATGCTACC CGTTTAGGGT TAACAAGGGG 6000 
GCCTTATAAA CACTATTGCT AATGCCCTCT TGAGGGTCCG CTTATCGGTA 6050 

5 

GCTACaCAGG CCCCTCTGAT TGACGTTGGT GTAGCCTCCC GTAGTCTTCC 6100 
10 TGGGCCCCTG GGAGGTACAT GTCCCCCAGC ATTGGTGTAA GAGCTTCAGC 6150 

CA&GAGTTAC ACATAAAGGC AATGTTGTGT TGCAGTCCAC AGACTGCAAA 6200 

15 

GTCTGCTCCA GGATGAAAGC CACTCAGTGT TGGCAAATGT 6CACATCCAT 6250 
TTATAAGGAT GTCAACTACA GTCAGAGfiAC CCCTTTGTGT TTGGTCCCCC 6300 

20 

CCCGTGTCAC ATGXGGAACA^ GGGCCCA6TT GGCAAGTTGT ACCAACCAAC 6350 
25 TGAAGGGATT ACATGCACTG CCCGTGACCA ATACAAAACA AAAGCGCTCC 6400 

TCGTACCAGC GAAtSAAGGGG CAGAGATGCC GTAGTCAGGT TTAGTTCGTC 6450 

30 

CGGCGGCGGG GGATCCGCCA GAAATCCGCG CGGTGGTITT TGGGGGTCG6 6500 
GGGTGTTTGG CAGCCACAGA CGCCCGGTGT TCGTGTCGCG CCAGTACATG 6550 

35 

CGGTCCATGC CCAGGCGATC CAAAAACCAT GGGTCTGTCT GCTCAGTCCA 6600 
40 GTCGTGGACC T6ACCCCACG CAACGCCCAA AAGAATAACC CCCACGAACC 6650 

ATAAACCATT CCCCATGGGG GACCCCGTGC CTAACCCACG GGGCCCGTGG 6700 

45 

CTATGGCGGG CTTGCCGCCC CGACGTTGGC TGCGAGCCCT GGGCCTTCAC 6750 
CCGAACTTGG GGGTTGGGGT GGGGAAAAGG AAGAAACGCG GGCGTATTGG 6800 

50 

CCCCAATGGG GTCTCGGTGG GGTATCGACA GAGTGCCAGC CCTGGGACCG 6850 
55 AACCCCGCGT TTATGAACAA ACGACCCAAC ACCCGTGCGT TTTATTCTGT 6900 

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CTTTTTATTG CCGTCATAGC GCGGGTTCCT TCCGGTATTG TCTCCTTCCG 6950 



TGTTTCAGTT AGCCTCCCCC ATCTCCCGGG GTGGGCGAA6 AACTCCAGCA 7000 

5 

TGAGATCCCC GCGCTGGAGG ATCATCCAGC CGGCGTCCCG GAAAACGATT 7050 
10 CCGAAGCCCA ACCTTTCATA GAAGGCGGCG GTGGAATCGA AATCTCGTGA 7100 

TG6CAGGTTG GGCGTCGCTT GGTCGGTCAT TTCGAACCCC AGAGTCCCGC 7150 

15 

TCAGAAGAAC TCGTCAAGAA GGCGATAGAA GGCGATGCGC TGCGAATCGG 7200 
GAGCGGCGAT ACCGTAAAGC ACGAGGAAGC GGTCAGCCCA TTCGCCGCCA 7250 

20 

AGCTCTTCAG CAATATCACG GGTAGCCAAC GCTATGTCCT GATAGCGGTC 7300 
25 CGCCACACCC AGCCGGCCAC AGTCGATGAA TCCAGAAAAG CGGCCATTTT 7350 

CCACCATGAT ATTCGGCAAG CAGGCATCGC CATGGGTCAC_GACGAGATCC 7400 

30 

TCGCCGTCGG GCATGCGCGC CTTGAGCCTG GCGAACAGTT CGGCTGGCGC 7450 
GAGCCCCTGA TGCTCTTCGT CCAGATCATC CTGATCGACA AGACCGGCTT 7500 

35 

CCATCCGAGT ACGTGCTCGC TCGATGCGAT GTTTCGCTTG GTGGTCGAAT 7550 
40 GGGCAGGTAG CCGGATCAAG CGTATGCAGC CGCCGCATTG CATCAGCCAT 7600 

GATGGATACT TTCTCGGCAG GAGCAAGGTG AGATGACAGG AGATCCTGCC 7650 

45 

CCGGCACTTC GCGCAATAGC AGCCAGTCCC TTCCCGCTTC AGTGACAACG 7700 
TCGAGGACAG CTGCGCAAGG AACGCCCGTC GTGGCCAGCC ACGATAGCCG 7750 

50 

CGCTGCCTCG TCCTGCAGTT CATTCAGGGC ACCGGACAGG TCGGTCTTGA 7800 



55 CAAAAA6AAC CGGGCGCCCC TGCGCTGACA GCCGGAACAC GGCGGCATCA 7850 
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GAGCAGCCGA TTGTCTGTTG TGCCCAGTCA TAGCCGAATA GCCTCTCCAC 7900 



CCAAGCGGCC GGAGAACCTG CGTGCAATCC ATCTTGTTCA ATCATGCGAA 7950

5 

ACGATCCTCA TCCTGTCTCT TGATCAGATC TGCGGCACGC TGTTGACGCT 8000 
10 GTTAAGCGGG TCGCTGCAGG GTCGCTCGGT GTTCGAGGCC ACACGCGTCA 8050 

CCTTAATATG CGA21GTGGAC CTGGGACCGC GCCGCCCCGA CTGCATCTGC 8100 

15 

CxTGTTCGAAT TCATCAAAGC AACCATAGTA CGCGCCCTQT AGCGGC6CAT ai50 
TAAGCGCGGC GGGTGTGGTG GTTACGCGCA GCGTGACC6C TACACTTGCC 8200 

20 

AGCGCCCTAG CGCCCGCTCC TTTCGCTTTC TTCCCTTCCT TTCTCGCCAC 8250 
25 GTTC6CC66C TTTCCCCGTC AAGCTCTMA TCGGGGGCTC CCTTIAGGGT 8300 

TCCGATTTAG TGCTTTACGG CACCTCGACC CCAAAAAACT TGATTTGGGT 8350 

30 

GATGGTTCAC GTAGTGGGCC ATCGCCCTGA TAGACGGTXT TTCGCCCTTT 8400 
GACGTTGGAG TCCACGTTCT TTAATACTG6 ACTCTTGTTC CAAACIG6AA 8450 

35 

CAACACTCAA CCCTATCTCG GGCTATTCTT TTGATTTATA AGGGATTTTG 8500 
40 CC6ATTTCGG CCTATTGGTT AAAAAATGAG CTGATTTAAC AAAAATTTAA 8550 

CGC6AATTTT AACAAAATAT TAACGTTTAC AATTTTATGG TGCAGGCCTC 8600 

45 

GTGATACGCC TATTTTTATA GGTTAATGTC ATGATAATAA TGGTTTCTTA 8650 
GAGGTCAGGT GGCACTTTTC GGGGAAATGT GC6CGGAACC CCTATTT6TT 8700 

50 

TATTTTTdA AATACATTCA AATATGTATC CGCTCATGAG ACAATAACCC 8750 



55 TGAIAAATGC TTCAATAATA TTGAAAAAGG AAGAGTFATGA 6TATTCAACA 8800 
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TTTCCGTGTC GCCCTTATTC CCnTTTTGC GGCATTTTGC CTTCCTGTrT 8850 



5 



TTGCTCACCC AGAAACGCTG GTGAAAGTAA AAGATGCTGA AGATCAGTTG 8900 
GGTGCACGAG TGGGTTACAT CGAACTGGAT CTCAACAGCG GTAAGATCCT 8950 
10 TGAGAGTTTT CGCCCCGAAG AACGTTTTCC AATGATGAGC ACTTTTAAAG 9000 






TTCTGCTATG TGGCGCGGTA TTATCCCGTG ATGACGCCGG GCAAGAGCAA 9050 



CTCGGTCGCC GCATACACTA TTCTCAGAAT GACTTGGTrG AGTACTCACC 9100 



AGTCACAGAA AAGCATCTTA CGGATGGGAT GACAGTAAGA GAATTATGCA 9150 
GTGCTGCCAT AACCATGAGT GATAACACTQ CGGCCAACTT ACTTCTGACA 9200 
25 ACGATCGGAG GACCGAAGGA GCTAACCGCT TTTTTGCACA ACATGGGGGA 9250 



TCATGTAACT CGCCTTGATC 6TTGGGAACC GGAGCTGAAT GAAGCCATAC 9300 
CAAACGACGA GCGTGACACC ACGATGCCAG CAGCAATGGC AACAACGITG 9350 
CGCAAACTAT TAACTGGCGA ACTACTTACT CTAGCTTCCC GGCAACAATT 9400 
AATAGACTGG ATGGAGGCGG ATAAAGTTGC AGGACCACTT CTGCGCTCGG 9450 
40 CCCTTCCGGC TGGCTGGTTT ATTGCTGATA AATCTGGAGC CGGTGAGCGT 9500 



GGGTCTCGCG GTATCATTGC AGCACTGGGG CCAGATGGTA AGCCCTCCCG 9550 



TATCGTAGTT ATCTACACGA CGGGGAGTCA GGCAACTATG GATGAACGAA 9600 



ATAGACAGAT CGCTGAGATA GGTGCCTCAC TGATTAAGCA TTGGTAACTG 9650 



TCAGACCAAG TTTACTCATA TATACTTTAG ATTGATTTAA AACTTGATIT 9700 
55 TTAATTTAAA AGGATCTAGG TGAAGATCCT TTTTGATAAT CTCATGACCA 9750 
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AAATCCCTTA ACGTGAGlTr TCGTTCCACT GAGCGTCAGA CCCCGTAGAA 9800 
AAGATCAAAG GATCTTCTTG AGATCCTTTT TTTCTGCGCG TAATCTGCTG 9850 

5 

CTTGCAAACA AAAAAACCAC CGCTRCCAGC GOTGGTTTGT TTGCCGGATC 9900 
10 AAGAGCTACC AACTCTTTTT CCGAAGGTAA CTGGCTTCAG CAGAGCGCAG 9950 

ATACCAAATA CTGTCCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA 10000 

15 

GAACrCTGTA GCACCGCCTA CATACCTCGG TCTGCTAATC CTGTTACCAG 10050 
TGGCTGCTGC CAGTGGC6AT AAGTCGTGTC TTACCGGGTT GGACTCAAGA 10100 

20 

CGATAGTTAC CGGATAAGGC GCAGCGGTCG GGCTGAACGG GGGGTTCGTG 10150 
25 CACACAGCCC AGCTTGGRGC GAACGACCTA CACCGAACTG AGATACCTAC 10200 

AGCJ3TGAGCA TTGAGAAAGC GGCACGCTTC CCGAAGGGAG AAA6GCGGAC 10250 

30 

AG6TATCCGG lAAGCGGCAG GGTCGGAACA GGAGAGCGCA CGAGGGAGCT 10300 
TCCAGGGGGA AACGCCTGGT ATdTTATAG TCCTGTCGGG TTTCGCCACC 10350 

35 

TCTGACTTGA GCGTCGATTT TTGTGATGCT CGTCASGGGG GCGGAGCCTA 10400 
40 TGGAAAAACG CCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG 10450 

TGAGCGCAAC GCAATTAATG TGAGTTACCT CACTCATTAG GCACCCCAGG 10500 

45 

CTTTACACrT TATGCTTCCG GCTCGTATGT TGTGTGGAAT TGTGAGCGGA 10550 

TAACAATTTC ACACfiGGAAA CA6CTAT6AC CATGATTACG AATTAA 10596 
50 - 

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

ATGAAGGCCC CCGCTGTGCT TGCACCTGGC ATCCTCGTGC TCCTGTTTAC 50
C 51

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

CACTAGTTAG GATGGGGGAC ATGTCTGTCA GAGGATACTG CACTTGTCGG 50
CATGAA 56

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

TAGTACTAGC ACTATGATGT CT 22

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

TTTACTTCTT GACGGTCCAA AG 22

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

CAGGGGGAST TGCAGATTCA GCTGT 25

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

AGTTTTGTCG GTGACCTGAT CATTCTGATC TGGTTGAACT ATTAC 45
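
Each entry of the sequence listing states a LENGTH and ends its sequence line(s) with a running base count, so a transcription of the listing can be checked mechanically. The following Python sketch is offered only as an illustration; the function name `check_listing_entry` and its return convention are assumptions, not part of the disclosure. It strips the grouping spaces and trailing counts, compares the number of bases with the declared length, and reports any characters other than A, C, G and T.

```python
def check_listing_entry(lines, declared_length):
    """Check one sequence-listing entry against its declared LENGTH.

    lines: the sequence lines of the entry, each made of space-separated
           groups of bases followed by a running base count.
    Returns (length_matches, unexpected_characters).
    """
    groups = []
    for line in lines:
        tokens = line.split()
        # The last token of each line is the running count, not sequence.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        groups.extend(tokens)
    sequence = "".join(groups).upper()
    # Anything outside A, C, G, T hints at a transcription problem.
    unexpected = sorted(set(sequence) - set("ACGT"))
    return len(sequence) == declared_length, unexpected


if __name__ == "__main__":
    # SEQ ID NO:21 as listed above (declared LENGTH: 45 bases).
    seq21 = ["AGTTTTGTCG GTGACCTGAT CATTCTGATC TGGTTGAACT ATTAC 45"]
    print(check_listing_entry(seq21, 45))

    # A two-line entry, SEQ ID NO:16 (declared LENGTH: 51 bases).
    seq16 = [
        "ATGAAGGCCC CCGCTGTGCT TGCACCTGGC ATCCTCGTGC TCCTGTTTAC 50",
        "C 51",
    ]
    print(check_listing_entry(seq16, 51))
```

A check of this kind only confirms the base count and the alphabet; it cannot detect a base that was replaced by another valid base.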






WO 93/23541 PCT/US93/04648

Claims : 

I, A hepatpcyte growth factor (HGF) variant resistant to 
proteolytic cleavage by enzymes that are capable of in vivo conversion 
of HGF into its two- chain form. ■ • 

5 2. The variant of claim 1 which is a variant of human HGF 

(huHGF) : 

3. A hepatocyte growth factor (HGF) variant stabilized in 

single-chain form by site directed mutagenesis within a region 

recognized by an enzyme capable of converting HGF into its two-chain 
10 form. 

4. The variant of claim 3 which is capable of binding an HGF 
receptor . 

5. The variant of claim 4 which is a variant of hiunan HGF 
(huHGF) . 

15 6. The variant of claim 5 having an amino acid alteration at or 

adjacent to amino acid positions 493, 494, 495 or 496 of the wild- type 
huHGF amino acid sequence. 

* 

7. The variant of claim 6 wherein said alteration is 
substitution. 

20 8. The variant of claim 6 wherein said alteration is insertion 

or deletion. 

9. The variant of claim 6 in which amino acid position 494 is 
occupied by an amino acid other than arginine. 

10. The variant of claim 9 wherein said amino acid is selected 
25 from the group consisting of glutamic acid, aspartic acid and alanine. 

II. The variant of claim 6 wherein said alteration is the 
substitution of at least one amino acid at amino acid positions 493- 
496 of the wild- type hHGF amino acid sequence. 

12. The variant of claim 11 having another amino acid
substituted for arginine at amino acid position 494 of wild-type
huHGF.

13. The variant of claim 11 having another amino acid
substituted for valine at amino acid position 495 of wild-type huHGF.

14. The variant of claim 11 having another amino acid
substituted for valine at amino acid position 496 of wild-type huHGF.

15. The variant of claim 6 retaining substantially full receptor
binding affinity of wild-type huHGF.

16. A hepatocyte growth factor (HGF) variant having an amino
acid alteration at a site within the protease domain of HGF and
retaining substantially full receptor binding affinity of the
corresponding wild-type HGF.

17. The variant of claim 16 comprising an alteration in a region
corresponding to the catalytic site of serine proteases.

18. The variant of claim 16 comprising an alteration at or
adjacent to any of positions 534, 673 and 692 of the wild-type human
HGF (huHGF) amino acid sequence.

19. The variant of claim 18 wherein said alteration is
substitution.

20. The variant of claim 19 having another amino acid
substituted for glutamine at position 534 of the wild-type huHGF amino
acid sequence.

21. The variant of claim 20 wherein said amino acid is
histidine.

22. The variant of claim 19 having another amino acid
substituted for tyrosine at position 673 of the wild-type huHGF amino
acid sequence.

23. The variant of claim 22 wherein said amino acid is devoid of
aromatic and heterocyclic moieties.

24. The variant of claim 23 wherein said amino acid is selected
from serine, threonine, asparagine, cysteine, glycine, alanine, and
valine.

25. The variant of claim 24 wherein said amino acid is serine.

26. The variant of claim 19 having another amino acid
substituted for valine at position 692 of the wild-type huHGF amino
acid sequence.

27. The variant of claim 26 wherein said amino acid is polar.

28. The variant of claim 27 wherein said amino acid is selected
from serine, threonine, asparagine and glutamine.

29. The variant of claim 28 wherein said amino acid is serine.

30. The variant of claim 24 further comprising the substitution
of glutamine at position 534 or valine at position 692 of the
wild-type huHGF amino acid sequence.

31. The variant of claim 30 having serine substituted for
tyrosine at position 673 of the wild-type huHGF amino acid sequence.

32. The variant of claim 31 having histidine substituted for
glutamine at position 534 of the wild-type huHGF amino acid sequence.

33. The variant of claim 31 having serine substituted for valine
at position 692 of the wild-type huHGF amino acid sequence.

34. The variant of claim 33 additionally having histidine
substituted for glutamine at position 534 of the wild-type huHGF amino
acid sequence.

35. The variant of any of claims 1-15 having an amino acid
alteration at a site within the protease domain of HGF and retaining
substantially full receptor binding affinity of the corresponding
wild-type HGF.

36. The variant of any of claims 16-34 that is resistant to
proteolytic cleavage.

37. The variant of any of claims 1-36 that is substantially 
incapable of HGF receptor activation. 

38. The variant of any of claims 1-36 that is substantially 
devoid of HGF hepatocyte growth stimulating activity. 

39. The variant of any of claims 1-36 having increased receptor
binding affinity as compared to wild-type huHGF.

40. The variant of claim 39 wherein the increase in receptor 
binding affinity is accomplished by an alteration in a receptor- 
binding domain of the huHGF amino acid sequence. 

41. The variant of claim 40 wherein said alteration is in the
huHGF α-chain.

42. The variant of claim 41 wherein said alteration is within 
the Kringle 1 domain. 

43. The variant of claim 42 wherein said alteration is within
the patch defined by amino acid positions 159, 161, 195 and 197 of the
wild-type huHGF amino acid sequence.

44. The variant of claim 42 wherein said alteration is at amino
acid position 173 of wild-type huHGF.

45. The variant of claim 41 wherein said alteration is within
the hairpin domain, N-terminal of the hairpin domain, or between the
hairpin and the Kringle 1 domains of wild-type huHGF.

46. The variant of any of claims 1-45 devoid of functional
Kringle 2 domain.

47. The variant of any of claims 1-45 devoid of functional
Kringle 3 domain.

48. The variant of any of claims 1-45 devoid of functional
Kringle 4 domain.

49. A nucleotide sequence encoding the variant of any of claims
1-48.

50. A replicable expression vector containing and capable of
expressing in a suitable host cell the nucleotide sequence of claim
49.

51. A host cell transformed with the vector of claim 50.

52. A process comprising culturing the host cells of claim 51 so
as to express the nucleic acid encoding the HGF variant.

53. The process of claim 52 further comprising recovering the
variant from the host cell culture.

54. A pharmaceutical composition comprising a variant of any of
claims 1-48 in an amount capable of competitive inhibition of the
binding of wild-type huHGF to its receptor, in admixture with a
pharmaceutically acceptable carrier.

55. A method of treating a pathological condition associated
with the activation of a huHGF receptor comprising administering to a
patient in need the composition of claim 54.
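
The substitution claims identify a variant by the wild-type residue, its position in the wild-type huHGF numbering, and the replacing residue (for example, arginine at position 494 replaced by glutamic acid, as in claims 9 and 10). A minimal Python sketch of that bookkeeping follows. The function `substitute` and the four-residue `toy_site` string are hypothetical illustrations: the toy string is not the huHGF sequence, it only carries the residues the claims name at positions 494-496, with `X` standing for the unspecified residue at position 493.

```python
def substitute(seq: str, position: int, expected: str, new: str,
               first_position: int = 1) -> str:
    """Return seq with the residue at `position` replaced by `new`.

    `first_position` is the numbering of seq[0], so a fragment can keep
    the numbering of the full-length wild-type sequence.
    """
    index = position - first_position
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + new + seq[index + 1:]


# Hypothetical fragment covering positions 493-496 (X = unspecified residue).
toy_site = "XRVV"
variant = substitute(toy_site, 494, "R", "E", first_position=493)
```

Checking the expected wild-type residue before substituting guards against off-by-one numbering errors, which matter here because the claims distinguish adjacent positions 494, 495 and 496.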
<Modified RK5 derivative, insert Xho I into Hind III site; PJG 2/89>

<from Cori 86.3.18 fix DHFR acc I>

<sequence of CMV enhancer/promoter, from Cell 41, 1985>
T<from pPMLCMV beginning to HindIII, enhancers and promoter>
TCGAGCTCGCCCGACATTGATTA 

TTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCC 

ATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCT 

GACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCC 

ATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT 

ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTA 

CGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCC 

CAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATT 

AGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGC 

GTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGAC 

GTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATG 

TCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGT 

GGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCG 

<Begin RNA> 

TCAGATCGCCTGG 

AGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATC 

CAGCCTCCGCGGCCGGGAACGGTGCATTGGAACGCGGATTCCCCGTGCCA 

AGAGTGACGTAAGTACCGCCTATAGAGTCTATAGGCCCACCCCCTTGGCT 
T 

<sp6 promoter>

<GGCCCACCCCCTTGGCTT>CGTTAGAACGCGGCTACAATTAATACATAACC 

TTATGTATCATACACATACGATTTAGGTGACACTATA<sp6 RNA start>GAATA<ACATCCACTTTGCCTTTC>

ACATCCACTTTGCCTTTCTCTCC 

ACAGGTGTCCACTCCCAGGTCCAA<PstI-ClaI Converter>CTGCA 

<cloning linker>CCTCGGTTCTATCGATTGAATTCCCCGGGGATCCTCTAGAGTCGACCTGCAGAAGCTT
GCCTCGAGGCAAGCTT 

GGCCGCCATGGCCC 

<SV40 early poly A>AACTTGTTTATTGCAGCTTATAATGGTT

ACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTA

GTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCT 

GGATCGATCGG 

<SV40 origin>GAATTAATTCGGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTTA
GGTACCTTC

TGAGGCGGAAAGAACCAGCT

GTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGC 

AGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGC

TCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCG 

CCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCAT 

GGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC 

CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAA

<start pUC118>

AAGCTGTTAACAGCTTGGC 

ACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACC 

CTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCCTTCGCCAGCTGGCGTAATA 

FIG. 5A 
GCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGTAGCCTGAATGGCGAATGGC

<start M13> 

G CCTGATGCGG 

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TA 

CGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTG 

TGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCG 

CTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGG 

GGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATT 

TGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGT 

TGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTA 

TCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAA 

ATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTT



TATGGTGCAC TCTCAGTACA

ATCTGCTCTG ATGCCGCATA GTTAAGCCA ACTCC GCTATCGCTA
CGTGACTGGG

TCATGGCTGC GCCCCGACAC CCGCCAACAC CCG CTG ACGC GCCCTGACGG
GCTTGTCTGC

TCCCGGCATC CGCTTACAGA CAAGCTGTGA CCGTCTCCGG GAGCTGCATG
TGTCAGAGGT

TTTCACCGTC ATCACCGAAA CGCGCGAGGC AG

TATTC

<Hinc II (2271) to GTCATC>

<Pst I (1973) to CTGCTG>

<Acc I (183) delete 6 bp>
<Arbitrarily change EcoRI (i) to GAATAC>
<pUCx 83.11.25 sequence not fully known>

TTGAAGACGA AAGGGCCTCG TGATACGCCT
ATTTTTATAG

GTTAATGTCA TGATAATAAT GGTTTCTTAG ACGTCAGGTG GCACTTTTCG
GGGAAATGTG

CGCGGAACCC CTATTTGTTT ATTTTTCTAA ATACATTCAA ATATGTATCC
GCTCATGAGA

CAATAACCCT GATAAATGCT TCAATAATAT TGAAAAAGGA AGAGTATGAG
TATTCAACAT

TTCCGTGTCG CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT
TGCTCACCCA

GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT
GGGTTACATC

GAACTGGATC TCAACAGCGG TAAGATCCTT GAGAGTTTTC GCCCCGAAGA
ACGTTTTCCA

ATGATGAGCA CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTGA
TGACGCCGGG

CAAGAGCAAC TCGGTCGCCG CATACACTAT TCTCAGAATG ACTTGGTTGA
GTACTCACCA

GTCACAGAAA AGCATCTTAC GGATGGCATG ACAGTAAGAG AATTATGCAG
TGCTGCCATA

ACCATGAGTG ATAACACTGC GGCCAACTTA CTTCTGACAA CGATCGGAGG
ACCGAAGGAG

CTAACCGCTT TTTTGCACAA CATGGGGGAT CATGTAACTC GCCTTGATCG
TTGGGAACCG

GAGCTGAATG AAGCCATACC AAACGACGAG CGTGACACCA CGATGCCAGC
AGCAATGGCA

ACAACGTTGC GCAAACTATT AACTGGCGAA CTACTTACTC TAGCTTCCCG
GCAACAATTA

ATAGACTGGA TGGAGGCGGA TAAAGTTGCA GGACCACTTC TGCGCTCGGC 
CCTTCCGGCT 

GGCTGGTTTA TTGCTGATAA ATCTGGAGCC GGTGAGCGTG GGTCTCGCGG 
TATCATTGCA 

GCACTGGGGC CAGATGGTAA GCCCTCCCGT ATCGTAGTTA TCTACACGAC 
GGGGAGTCAG 

GCAACTATGG ATGAACGAAA TAGACAGATC GCTGAGATAG GTGCCTCACT 
GATTAAGCAT 

TGGTAACTGT CAGACCAAGT TTACTCATAT ATACTTTAGA TTGATTTAAA 
ACTTCATTTT 

TAATTTAAAA GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA 
AATCCCTTAA 

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG 
ATCTTCTTGA 

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC 
GCTACCAGCG 

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC 
TGGCTTCAGC 

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA 
CCACTTCAAG

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT 
GGCTGCTGCC 

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC
GGATAAGGCG

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG 
AACGACCTAC 

ACCGAACTGA GATACCTACA GCGTGAGCAT TGAGAAAGCG CCACGCTTCC 
CGAAGGGAGA 

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC 
GAGGGAGCTT 

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT 
CTGACTTGAG 

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC 
CAGCAACGCG 

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT 
TCCTGCGTTA 

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC 
CGCTCGCCGC 

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGC 

<end M13> 
GCCCAATACGCAA 

ACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATCCAGCTGGCACGACAGGTTTCCCGA 

CTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTACCTCACTCATTAGGCACC 

CCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACA 

ATTTCACACAGGAAACAGCTATGACCATGATTAC 

GAATTAA 



FIG. 5C
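
The figures interleave nucleotide sequence with angle-bracketed annotations such as `<Begin RNA>`. To compare the bare sequence against another listing, the annotations and layout whitespace can be stripped first. A minimal Python sketch, assuming every annotation is opened by `<` and closed by `>` (brackets damaged in reproduction would need manual repair beforehand); `strip_annotations` is a hypothetical helper name, not something defined in this document.

```python
import re

# Matches one annotation: "<", any run of non-bracket characters, ">".
ANNOTATION = re.compile(r"<[^<>]*>")


def strip_annotations(text: str) -> str:
    """Drop <...> annotations and all whitespace, returning upper-case bases."""
    without_tags = ANNOTATION.sub("", text)
    return "".join(without_tags.split()).upper()


# Short excerpt in the style of the figure.
example = "GGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCG\n<Begin RNA>\nTCAGATCGCCTGG"
bare = strip_annotations(example)
```

Note that a few annotations in the figures themselves enclose bases (for instance `<GGCCCACCCCCTTGGCTT>`); this sketch removes them along with the textual comments, so whether such bracketed bases belong to the plasmid sequence has to be decided by hand.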
<pCIS.EBON>

<assembled by Steve Williams June 1989>
<"poison-minus" pRK>
<with EBNA-1, oriP, neoR>
<polylinker sites: XhoI, HindIII, NotI>

<CMV enhancer/promoter>
T

TCGAGCTCGCCCGACATTGATTA 

TTG<SpeI>ACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCC 

ATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCT 

GACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCC 

ATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT 

ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTA 

CGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCC 

CAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATT 

AGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGC 

GTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGAC 

GTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATG 

TCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGT 

GGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCG 

<Begin RNA> 

TCAGATCGCCTGG 

AGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATC 

CAGCCTCCGCGGCCGGGAACGGTGCATTGGAACGCGGATTCCCCGTGCCA 

AGAGTGACGTAAGTACCGCCTATAGAGTCTATAGGCCCACCCCCTTGGCT 
T 

<sp6 promoter>

<GGCCCACCCCCTTGGCTT>CGTTAGAACGCGGCTACAATTAATACATAACC

TTATGTATCATACACATACGATTTAGGTGACACTATA<sp6 RNA start>GAATA<ACATCCACTTTGCCTTTC>

ACATCCACTTTGCCTTTCTCTCC 

ACAGGTGTCCACTCCCAGGTCCAA<PstI-ClaI converter>CTGCA 

<cloning linker>CCTCGGTTCT
ATCGATTCTCGA

<EcoRI/Klenow>GAATTAATTC

AAGCTTGCGGCCGCAGCTT 
GGCCGCCATGGCCC 

<sv40 early poly A>AACTTGTTTATTGCAGCTTATAATGGTT

ACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTA 

GTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCT 

GGATCGATCGG 

<sv40 origin>GAATTAATTCGGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTTA

<KpnI>GGTACCTTC
TGAGGCGGAAAGAACCAGCT 

GTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGC 
AGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGC
TCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCG
CCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCAT 
GGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC 
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAA 
<start pUC118> 

AAGCTGTT<HpaI end from SW1> 
<delta 3>

<PCR primer seq; PCR product was blunted & ligated in> CACGTGAT
<this is EcoRV site remnant>

<RI cassette: EBNA-oriP-tk term-NeoR-tk prom>
<4913 bp EcoRI-Bam from 220_2>
<EcoRI> 
GAATTCTCAT 

GTTTGACAGC TTATCATCGA TA 
<EBNA-1>

GATCCTCACA GGCCGCACCC AGCTTTTCTT CCGTTGCCCC AGTAGCATCT CTGTCTGGTG 
ACCTTGAAGA GGAAGAGGAG GGGTCCCGAG AATCCCCATC CCTACCGTCC AGCAAAAAGG 
GGGACGAGGA ATTTGAGGCC TGGCTTGAGG CTCAGGACGC AAATCTTGAG GATGTTCAGC 
GGGAGTTTTC CGGGCTGCGA GTAATTGGTG ATGAGGACGA GGATGGTTCG GAGGATGGGG 
AATTTTCAGA CCTGGATCTG TCTGACAGCG ACCATGAAGG GGATGAGGGT GGGGGGGCTG 
TTGGAGGGGG CAGGAGTCTG CACTCCCTGT ATTCACTGAG CGTCGTCTAA TAAAGATGTC 
TATTGATCTC TTTTAGTGTG AATCATGTCT GACGAGGGGC CAGGTACAGG ACCTGGAAAT 
GGCCTAGGAG AGAAGGGAGA CACATCTGGA CCAGAAGGCT CCGGCGGCAG TGGACCTCAA 
AGAAGAGGGG GTGATAACCA TGGACGAGGA CGGGGAAGAG GACGAGGACG AGGAGGCGGA 
AGACCAGGAG CCCCGGGCGG CTCAGGATCA GGGCCAAGAC ATAGAGATGG TGTCCGGAGA 
CCCCAAAAAC GTCCAAGTTG CATTGGCTGC AAAGGGACCC ACGGTGGAAC AGGAGCAGGA 
GCAGGAGCGG GAGGGGCAGG AGCAGGAGGG GCAGGAGCAG GAGGAGGGGC AGGAGCAGGA 
GGAGGGGCAG GAGGGGCAGG AGGGGCAGGA GGGGCAGGAG CAGGAGGAGG GGCAGGAGCA 
GGAGGAGGGG CAGGAGGGGC AGGAGGGGCA GGAGCAGGAG GAGGGGCAGG AGCAGGAGGA 
GGGGCAGGAG GGGCAGGAGC AGGAGGAGGG GCAGGAGGGG CAGGAGGGGC AGGAGCAGGA 
GGAGGGGCAG GAGCAGGAGG AGGGGCAGGA GGGGCAGGAG CAGGAGGAGG GGCAGGAGGG 
GCAGGAGGGG CAGGAGGAGG AGGAGGGGCA GGAGCAGGAG GGGCAGGAGG GGCAGGAGGG 
GCAGGAGCAG GAGGGGCAGG AGCAGGAGGA GGGGCAGGAG GGGCAGGAGG GGCAGGAGCA 
GGAGGGGCAG GAGCAGGAGG GGCAGGAGCA GGAGGGGCAG GAGCAGGAGG GGCAGGAGGG 
GCAGGAGCAG GAGGGGCAGG AGGGGCAGGA GCAGGAGGGG CAGGAGGGGC AGGAGCAGGA 
GGAGGGGCAG GAGGGGCAGG AGCAGGAGGA GGGGCAGGAG GGGCAGGAGC AGGAGGGGCA 
GGAGGGGCAG GAGCAGGAGG GGCAGGAGGG GCAGGAGCAG GAGGGGCAGG AGGGGCAGGA 
GCAGGAGGAG GGGCAGGAGC AGGAGGGGCA GGAGCAGGAG GTGGAGGCCG GGGTCGAGGA 
GGCAGTGGAG GCCGGGGTCG AGGAGGTAGT GGAGGCCGGG GTCGAGGAGG TAGTGGAGGC 
CGCCGGGGTA GAGGACGTGA AAGAGCCAGG GGGGGAAGTC GTGAAAGAGC CAGGGGGAGA 
GGTCGTGGAC GTGGAGAAAA GAGGCCCAGG AGTCCCAGTA GTCAGTCATC ATCATCCGGG 
TCTCCACCGC GCAGGCCCCC TCCAGGTAGA AGGCCATTTT TCCACCCTGT AGGGGAAGCC 
GATTATTTTG AATACCACCA AGAAGGTGGC CCAGATGGTG AGCCTGACGT GCCCCCGGGA 
GCGATAGAGC AGGGCCCCGC AGATGACCCA GGAGAAGGCC CAAGCACTGG ACCCCGGGGT 
CAGGGTGATG GAGGCAGGCG CAAAAAAGGA GGGTGGTTTG GAAAGCATCG TGGTCAAGGA 
GGTTCCAACC CGAAATTTGA GAACATTGCA GAAGGTTTAA GAGCTCTCCT GGCTAGGAGT 
CACGTAGAAA GGACTACCGA CGAAGGAACT TGGGTCGCCG GTGTGTTCGT ATATGGAGGT 
AGTAAGACCT CCCTTTACAA CCTAAGGCGA GGAACTGCCC TTGCTATTCC ACAATGTCGT 
CTTACACCAT TGAGTCGTCT CCCCTTTGGA ATGGCCCCTG GACCCGGCCC ACAACCTGGC
CCGCTAAGGG AGTCCATTGT CTGTTATTTC ATGGTCTTTT TACAAACTCA TATATTTGCT 
GAGGTTTTGA AGGATGCGAT TAAGGACCTT GTTATGACAA AGCCCGCTCC TACCTGCAAT 
ATCAGGGTGA CTGTGTGCAG CTTTGACGAT GGAGTAGATT TGCCTCCCTG GTTTCCACCT
ATGGTGGAAG GGGCTGCCGC GGAGGGTGAT GACGGAGATG ACGGAGATGA AGGAGGTGAT 
GGAGATGAGG GTGAGGAAGG GCAGGAGTGA TGTAACTTGT TAGGAGACGC CCTCAATCGT 
ATTAAAAGCC GTGTATTCCC CCGCACTAAA GAATAAATCC CCAGTAGACA TCATGCGTGC 
TGTTGGTGTA TTTCTGGCCA TCTGTCTTGT CACCATTTTC GTCCTCCCAA CATGGGGCAA 

ttgggcatac ccatgttgtc acgtcactca gctccgcgct caacaccttc tcgcgttgga 
aaacattagc gacatttacc tggtgagcaa tcagacatgc gacggcttta gcctggcctc 

CTTAAATTCA CCTAAGAATG GGAGCAACCA 

<oriP>GCATGCAGGA AAAGGACAAG CAGCGAAAAT
TCACGCCCCC TTGGGAGGTG GCGGCATATG CAAAGGATAG CACTCCCACT CTACTACTGG 
GTATCATATG CTGAC 
<family of repeats>

TGTAT ATGCATGAGG ATAGCATATG CTACCCGGAT ACAGATTAGG 
ATAGCATATA CTACCCAGAT ATAGATTAGG ATAGCATATG CTACCCAGAT ATAGATTAGG 
ATAGCCTATG CTACCCAGAT ATAAATTAGG ATAGCATATA CTACCCAGAT ATAGATTAGG 
ATAGCATATG CTACCCAGAT ATAGATTAGG ATAGCCTATG CTACCCAGAT ATAGATTAGG 
ATAGCATATG CTACCCAGAT ATAGATTAGG ATAGCATATG CTATCCAGAT ATTTGGGTAG 
TATATGCTAC CCAGATATAA ATTAGGATAG CATATACTAC CCTAATCTCT ATTAGGATAG 
CATATGCTAC CCGGATACAG ATTAGGATAG CATATACTAC CCAGATATAG ATTAGGATAG 
CATATGCTAC CCAGATATAG ATTAGGATAG CCTATGCTAC CCAGATATAA ATTAGGATAG 
CATATACTAC CCAGATATAG ATTAGGATAG CATATGCTAC CCAGATATAG ATTAGGATAG 
CCTATGCTAC CCAGATATAG ATTAGGATAG CATATGCTAT CCAGATATTT GGGTAGTATA
TGCTACCC
<end family of repeats>

AT GGCAACATTA GCCCACCGTG CTCTCAGCGA CCTCGTGAAT ATGAGGACCA 
ACAACCCTGT GCTTGGCGCT CAGGCGCAAG TGTGTGTAAT TTGTCCTCCA GATCGCAGCA 
ATCGCGCCCC TATCTTGGCC CGCCCACCTA CTTATGCAGG TATTCCCCGG GGTGCCATTA 
GTGGTTTTGT GGGCAAGTGG TTTGACCGCA GTGGTTAGCG GGGTTACAAT CAGCCAAGTT 
ATTACACCCT TATTTTACAG TCCAAAACCG CAGGGCGGCG TGTGGGGGCT GACGCGTGCC 
CCCACTCCAC AATTTCAAAA AAAAGAGTGG CCACTTGTCT TTGTTTATGG GCCCCATTGG 
CGTGGAGCCC CGTTTAATTT TCGGGGGTGT TAGAGACAAC CAGTGGAGTC CGCTGCTGTC
GGCGTCCACT CTCTTTCCCC TTGTTACAAA TAGAGTGTAA CAACATGGTT CACCTGTCTT 
GGTCCCTGCC TGGGACACAT CTTAATAACC CCAGTATCAT ATTGCACTAG GATTATGTGT 
TGCCCATAGC CATAAATTCG TGTGAGATGG ACATCCAGTC TTTACGGCTT GTCCCCACCC 
CATGGATTTC TATTGTTAAA GATATTCAGA ATGTTTCATT CCTACACTAG TATTTATTGC 
CCAAGGGGTT TGTGAGGGTT ATATTGGTGT CATAGCACAA TGCCACCACT GAACCCCCCG 
TCCAAATTTT ATTCTGGGGG CGTCACCTGA AACCTTGTTT TCGAGCACCT CACATACACC 
TTACTGTTCA CAACTCAGCA GTTATTCTAT TAGCTAAACG AAGGAGAATG AAGAAGCAGG 
CGAAGATTCA GGAGAGTTCA CTGCCCGCTC CTTGATCTTC AGCCACTGCC CTTGTGACTA 
AAATGGTTCA CTACCCTCGT GGAATCCTGA CCCCATGTAA ATAAAACCGT GACAGCTCAT 
GGGGTGGGAG ATATCGCTGT TCCTTA 

<dyad region>GGAC CCTTTTACTA ACCCTAATTC GATAGCATAT
GCTTCCCGTT GGGTAACATA TGCTATTGAA TTAGGGTTAG TCTGGATAGT ATATACTACT 
ACCCGGGAAG CATATGCTA 

<end dyad>C CCGTTTAGGG TTAACAAGGG GGCCTTATAA ACACTATTGC
TAATGCCCTC TTGAGGGTCC GCTTATCGGT AGCTACACAG GCCCCTCTGA TTGACGTTGG 
TGTAGCCTCC CGTAGTCTTC CTGGGCCCCT GGGAGGTACA TGTCCCCCAG CATTGGTGTA 
AGAGCTTCAG CCAAGAGTTA CACATAAAGG CAATGTTGTG TTGCAGTCCA CAGACTGCAA 
AGTCTGCTCC AGGATGAAAG CCACTCAGTG TTGGCAAATG TGCACATCCA TTTATT^GGA 
TGTCAACTAC AGTCAGAGAA CCCCTTTGTG TTTGGTCCCC CCCCGTGTCA Ci^TGTGGAAC 
AGGGCCCAGT TGGCAAGTTG TACCAACCAA CTGAAGGGAT TACATGCACT GCCCG 
<end oriP>

<HSV TK TERM 3' END>TGACC
AATACAAAAC AAAAGCGCTC CTCGTACCAG CGAAGAAGGG GCAGAGATGC CGTAGTCAGG
TTTAGTTCGT CCGGCGGC
<pUC12 SmaI-HaeIII polylinker>

GG G<Bam site is next> 



<1.65 kb Bam-EcoRI frag from pkan2 .per .bam> 
GGATCC 

<Bam site made by PCR>
GCCAGAAATCCGCGCGGTGGTTTTTGGGGGTCGG 

GGGTGTTTGGCAGCCACAGACGCCCGGTGTTCGTGTCGCGCCAGTACATGCGGTCCATGC 
CCAGGCCATCCAAAAACCATGGGTCTGTCTGCTCAGTCCAGTCGTGGACCTGACCCCACG 
CAACGCCCAAAAGAATAACCCCCACGAACCATAAACCATTCCCCATGGGGGACCCCGTCC 
CTAACCCACGGGGCCCGTGGCTATGGCGGGCTTGCCGCCCCGACGTTGGCTGCGAGCCCT 
GGGCCTTCACCCGAACTTGGGGGTTGGGGTGGGGAAAAGGAAGAAACGCGGGCGTATTGG 
CCCCAATGGGGTCTCGGTGGGGTATCGACAGAGTGCCAGCCCTGGGACCGAACCCCGCGT 
TTATGAACAAACGACCCAACACCCGTGCGTTTTATTCTGTCTTTTTATTGCCGTCATAGC 

GCGGGTTCCTTCCGGTATTGTCTCCTTCCGTGTTTCAGTTAGCCTCCCCCATCT 
<HSV1 tk terminator SmaI>

<following is EcoRI - SmaI from pKan2, rc>
<tn5 neomycin phosphotransferase gene>

<BglII-Sma, rc> 
<SmaI , AvaI>CCCGGGGTGGGCGAAGAACTCCAGCATGAGATCCCCGCGCT 
GGAGGATCATCCAGCCGGCGTCCCGGAAAACGATTCCGAAGCCCAACCTTTCATAGAAGG 
CGGCGGTGGAATCGAAATCTCGTGATGGCAGGTTGGGCGTCGCTTGGTCGGTCATTTCGA 
ACCCCAGAGTCCCGCTCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCGA 
ATCGGGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTC 
TTCAGCAATATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACACCCAGCCG 
GCCACAGTCGATGAATCCAGAAAAGCGGCCATTTTCCACCATGATATTCGGCAAGCAGGC 
ATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCGCGCCTTGAGCCTGGCGAA 
CAGTTCGGCTGGCGCGAGCCCCTGATGCTCTTCGTCCAGATCATCCTGATCGACAAGACC 
GGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTCGCTTGGTGGTCGAATGGGCA 
GGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATCAGCCATGATGGATACTTTCTC 
GGCAGGAGCAAGGTGAGATGACAGGAGATCCTGCCCCGGCACTTCGCCCAATAGCAGCCA 
GTCCCTTCCCGCTTCAGTGACAACGTCGAGCACAGCTGCGCAAGGAACGCCCGTCGTGGC 
CAGCCACGATAGCCGCGCTGCCTCGTCCTGCAGTTCATTCAGGGCACCGGACAGGTCGGT 
CTTGACAAAAAGAACCGGGCGCCCCTGCGCTGACAGCCGGAACACGGCGGCATCAGAGCA 
GCCGATTGTCTGTTGTGCCCAGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGA 
ACCTGCGTGCAATCCATCTTGTTCAATCATGCGAAACGATCCTCATCCTGTCTCTTGATC 

<tk promoter>

<EcoRI-BglII, rc>
<BglII>AGATCTGCGGCACGCTGTTGACGCTGTTAAGCGGGTCGCTGCA 
GGGTCGCTCGGTGTTCGAGGCCACACGCGTCACCTTAATATGCGAAGTGGACCTGGGACC 
GCGCCGCCCCGACTGCATCTGCGTGTTCGAATTC<EcoRI> 



<back to SW2 sequence; EcoRV site remnant> 



<M13 ori>

TCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTG 
TGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCG 
CTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGG
GGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATT 
TGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGT 
TGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTA 
TCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAA 
ATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTT 

TATGGTGCA<ApaLI/blunt>

<delta 2a>



<Eco0109I/blunt>GGCCTCG TGATACGCCT
ATTTTTATAG

GTTAATGTCA TGATAATAAT GGTTTCTTAG ACGTCAGGTG GCACTTTTCG
GGGAAATGTG

CGCGGAACCC CTATTTGTTT ATTTTTCTAA ATACATTCAA ATATGTATCC
GCTCATGAGA

CAATAACCCT GATAAATGCT TCAATAATAT TGAAAAAGGA AGAGTATGAG
TATTCAACAT

TTCCGTGTCG CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT
TGCTCACCCA

GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT
GGGTTACATC

GAACTGGATC TCAACAGCGG TAAGATCCTT GAGAGTTTTC GCCCCGAAGA
ACGTTTTCCA

ATGATGAGCA CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTGA
TGACGCCGGG

CAAGAGCAAC TCGGTCGCCG CATACACTAT TCTCAGAATG ACTTGGTTGA
GTACTCACCA

GTCACAGAAA AGCATCTTAC GGATGGCATG ACAGTAAGAG AATTATGCAG
TGCTGCCATA

ACCATGAGTG ATAACACTGC GGCCAACTTA CTTCTGACAA CGATCGGAGG
ACCGAAGGAG

CTAACCGCTT TTTTGCACAA CATGGGGGAT CATGTAACTC GCCTTGATCG
TTGGGAACCG

GAGCTGAATG AAGCCATACC AAACGACGAG CGTGACACCA CGATGCCAGC
AGCAATGGCA

ACAACGTTGC GCAAACTATT AACTGGCGAA CTACTTACTC TAGCTTCCCG
GCAACAATTA

ATAGACTGGA TGGAGGCGGA TAAAGTTGCA GGACCACTTC TGCGCTCGGC
CCTTCCGGCT

GGCTGGTTTA TTGCTGATAA ATCTGGAGCC GGTGAGCGTG GGTCTCGCGG
TATCATTGCA

GCACTGGGGC CAGATGGTAA GCCCTCCCGT ATCGTAGTTA TCTACACGAC
GGGGAGTCAG

GCAACTATGG ATGAACGAAA TAGACAGATC GCTGAGATAG GTGCCTCACT
GATTAAGCAT

TGGTAACTGT CAGACCAAGT TTACTCATAT ATACTTTAGA TTGATTTAAA
ACTTCATTTT

TAATTTAAAA GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA
AATCCCTTAA

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG
ATCTTCTTGA

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC
GCTACCAGCG

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC
TGGCTTCAGC

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA
CCACTTCAAG

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT
GGCTGCTGCC

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC
GGATAAGGCG

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG
AACGACCTAC

ACCGAACTGA GATACCTACA GCGTGAGCAT TGAGAAAGCG CCACGCTTCC
CGAAGGGAGA

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC
GAGGGAGCTT

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT
CTGACTTGAG

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC GAG



<delta1,PVU>

<PvuII site introduced by mutagenesis; 228 bp PvuII fragment
deleted>
<join to PvuII at 4532 in RK5> 

CTGGCACGACAGGTTTCCCGA 
CTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTACCTCACTCATTAGGCACC 
CCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACA 
ATTTCACACAGGAAACAGCTATGACCATGATTAC 
GAATTAA 